Posts

Showing posts from 2013

Downloading Tweets using the Twitter Public Stream API

In this post I will show how to connect to the Twitter Public Stream API and download live Tweets. We can save these Tweets in any delimited format, such as comma- or pipe-separated, and use them for data mining and data analysis. I have used these Tweets to analyse the current news in a given region. The program creates files of 50 MB each and keeps rotating them for as long as the stream is running. We can then run a Map-Reduce program on the Hadoop framework to analyse the data much faster. A minimal sketch of the streaming client is shown after the prerequisites below.

Prerequisites:
1. A Twitter application
2. Twitter OAuth tokens: https://dev.twitter.com/docs/auth/obtaining-access-tokens
3. Twitter4j libraries: http://twitter4j.org/en/

Below are the twitter4j APIs used:

TwitterStreamFactory: this API returns an object of type TwitterStream.

We need to set the authorization tokens below to connect to the Twitter Public Stream. You will find these tokens once you create the Twitter application.

-Dtwitter4j.oauth.consumerKey=XX
-Dtwitter4j.oauth.co
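The excerpt above is truncated, so as a rough illustration of the approach it describes, here is a minimal sketch using the twitter4j streaming API. The class name, the output file names, the byte-counting rotation logic, and the pipe-delimited "user|text" record format are assumptions for illustration, not the original post's code; the OAuth values must come from your own Twitter application (they can also be passed as -Dtwitter4j.oauth.* system properties as mentioned above).

import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

import twitter4j.*;
import twitter4j.conf.ConfigurationBuilder;

public class TweetDownloader {
    private static final long MAX_FILE_SIZE = 50L * 1024 * 1024; // rotate at roughly 50 MB, as in the post
    private static PrintWriter out;
    private static long bytesWritten = 0;
    private static int fileIndex = 0;

    // Open the next output file (tweets-0.txt, tweets-1.txt, ...); the names are illustrative
    private static void rotateFile() throws IOException {
        if (out != null) out.close();
        out = new PrintWriter(new FileWriter("tweets-" + (fileIndex++) + ".txt"));
        bytesWritten = 0;
    }

    public static void main(String[] args) throws IOException {
        rotateFile();

        // OAuth tokens from your Twitter application
        ConfigurationBuilder cb = new ConfigurationBuilder()
                .setOAuthConsumerKey("XX")
                .setOAuthConsumerSecret("XX")
                .setOAuthAccessToken("XX")
                .setOAuthAccessTokenSecret("XX");

        TwitterStream stream = new TwitterStreamFactory(cb.build()).getInstance();
        stream.addListener(new StatusListener() {
            public void onStatus(Status status) {
                // One pipe-delimited record per Tweet: user|text
                String record = status.getUser().getScreenName() + "|"
                        + status.getText().replace('\n', ' ');
                out.println(record);
                bytesWritten += record.length() + 1;
                if (bytesWritten >= MAX_FILE_SIZE) {
                    try { rotateFile(); } catch (IOException e) { e.printStackTrace(); }
                }
            }
            public void onDeletionNotice(StatusDeletionNotice notice) {}
            public void onTrackLimitationNotice(int numberOfLimitedStatuses) {}
            public void onScrubGeo(long userId, long upToStatusId) {}
            public void onStallWarning(StallWarning warning) {}
            public void onException(Exception ex) { ex.printStackTrace(); }
        });
        stream.sample(); // connect to the public sample stream
    }
}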

Integrating Weblogic with IBM MQ over JMS

We can integrate WebLogic with IBM MQ over JMS by creating a Foreign JMS Server with links in the WebLogic server. We need to create a .bindings file, which acts as a file-based JNDI provider holding the JMS bindings to the MQ resources. I will first explain how to create the .bindings file and then configure WebLogic to use it. A small lookup sketch follows the steps below.

Steps to create the .bindings file:

First, add the following JARs to the CLASSPATH:
jms.jar
com.ibm.mq.jar
com.ibm.mqjms.jar
jta.jar
connector.jar
jndi.jar
providerutil.jar
fscontext.jar

Create a JMSProperties.config file with the following properties:
INITIAL_CONTEXT_FACTORY=com.sun.jndi.fscontext.RefFSContextFactory
PROVIDER_URL=file:/tmp/mqm/jndi
SECURITY_AUTHENTICATION=none

Run the command below and define the connection factory and queues:
./JMSAdmin -v JMSProperties.config
InitCtx> def qcf(ForeignCF) TRANSPORT(CLIENT) HOST(localhost) PORT(1414) CHANNEL(MQCHANNEL) QMGR(MQMRG)
def q(MQ1) qmgr(MQMRG) queue(MQ1)
def q(MQR1) qmgr(MQMRG) queue(MQR1)
display ctx
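Before wiring this into WebLogic, it can help to verify the generated .bindings file with a small standalone JNDI lookup. The sketch below is not from the original post; the class name is hypothetical, the ForeignCF/MQ1/MQR1 names simply mirror the JMSAdmin definitions above, and it assumes the MQ and fscontext JARs listed earlier are on the classpath.

import java.util.Hashtable;
import javax.jms.Queue;
import javax.jms.QueueConnectionFactory;
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;

public class BindingsLookupTest {
    public static void main(String[] args) throws NamingException {
        // Same settings as JMSProperties.config; the .bindings file is expected under /tmp/mqm/jndi
        Hashtable<String, String> env = new Hashtable<String, String>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.fscontext.RefFSContextFactory");
        env.put(Context.PROVIDER_URL, "file:/tmp/mqm/jndi");

        Context ctx = new InitialContext(env);

        // Names defined above with JMSAdmin: ForeignCF, MQ1, MQR1
        QueueConnectionFactory cf = (QueueConnectionFactory) ctx.lookup("ForeignCF");
        Queue request = (Queue) ctx.lookup("MQ1");
        Queue reply = (Queue) ctx.lookup("MQR1");

        System.out.println("Looked up: " + cf + ", " + request + ", " + reply);
        ctx.close();
    }
}

If the lookups succeed, the same JNDI names can then be mapped in the WebLogic Foreign JMS Server configuration described in the post.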

A Brief Overview of Big Data and Hadoop

Big Data: the term Big Data applies to information that can't be processed or analyzed using traditional tools and processes. It is essentially about processing terabytes of unstructured information to generate the insights that businesses need.

Three basic characteristics of Big Data:
1. Volume: the data to be analyzed runs into terabytes. For example, Twitter alone generates around 7 terabytes (TB) of data per day and Facebook around 10 TB per day. Analyzing this volume requires a lot of hardware.
2. Variety: the data is not only organized in traditional table form but also includes raw, semi-structured, and unstructured data such as weblogs, social media, blogs, sensor readings, images, and videos.
3. Velocity: this volume and variety of data must be analyzed and processed quickly. Velocity matters because the data is still in motion, for example when analyzing real-time market data or a customer's browsing pattern while they are logged in.

BigData case s
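The excerpt is cut off before the Hadoop portion of the post. As a small illustration of how such unstructured data (for example the weblogs mentioned above) might be processed with Hadoop MapReduce, here is a minimal sketch that counts requests per client IP. The class and job names, the input/output paths taken from the command line, and the log format assumption (client IP as the first whitespace-separated field) are all illustrative, not from the original post.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WeblogHitCount {

    // Emits (clientIp, 1) for each log line; assumes the IP is the first field
    public static class HitMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text ip = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\\s+");
            if (fields.length > 0 && !fields[0].isEmpty()) {
                ip.set(fields[0]);
                context.write(ip, ONE);
            }
        }
    }

    // Sums the counts emitted for each IP
    public static class HitReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "weblog-hit-count");
        job.setJarByClass(WeblogHitCount.class);
        job.setMapperClass(HitMapper.class);
        job.setReducerClass(HitReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}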