First interaction Artificial Neural Network


I hated biology in my school days and loved mathematics. After a long time, I got to learn something that combines the two: the Artificial Neural Network (ANN for short), which is inspired by biological neural networks. You might find it odd, but that is how I like to define the artificial neural network. Biology here basically means the study of the brain, or more broadly the nervous system; artificial intelligence simply mimics how the nervous system works. Neural networks are becoming hugely popular nowadays, with big data by their side. In fact, a colleague who recently joined told me that you cannot do artificial neural networks, or any other machine learning algorithm, without big data. Of course I didn’t believe him and decided to try it myself. So everything else in this blog comes from my first interaction with…

View original post 784 more words


Resolving the Failure Issue of NameNode


In the previous blog “Smattering of HDFS“, we learnt that “The NameNode is a Single Point of Failure for the HDFS Cluster”. Each cluster had a single NameNode, and if that machine became unavailable, the whole cluster was unavailable until the NameNode was restarted or brought up on a different machine. In this blog, we will learn how to resolve the NameNode failure issue.

Issues that arise when the NameNode fails or crashes:
HDFS metadata, such as namespace information and block information, must be held in main memory while in use, but it must also be stored on disk for persistence. The NameNode stores two types of information:
1. in-memory fsimage – the latest, up-to-date snapshot of the Hadoop filesystem namespace.
2. editLogs – the sequence of changes made to the filesystem since the NameNode started.
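
The relationship between the two can be sketched with a toy model. This is not the Hadoop API: the Namespace type, the Create and Delete edits, and the replay function are all hypothetical names invented for illustration, and the namespace is reduced to a plain set of file paths. The sketch only shows the idea that the current state is the snapshot plus the logged changes applied in order.

```scala
object EditLogReplay {
  // Hypothetical toy model: the namespace is just a set of file paths.
  type Namespace = Set[String]

  // Hypothetical edit records; real HDFS edit logs hold many more operation types.
  sealed trait Edit
  final case class Create(path: String) extends Edit
  final case class Delete(path: String) extends Edit

  // Apply each logged change to the snapshot, in the order it was recorded.
  def replay(fsimage: Namespace, editLog: Seq[Edit]): Namespace =
    editLog.foldLeft(fsimage) {
      case (ns, Create(p)) => ns + p
      case (ns, Delete(p)) => ns - p
    }

  // Snapshot, followed by the changes logged after it was taken.
  val fsimage: Namespace = Set("/data/a.txt")
  val editLog: Seq[Edit] = Seq(Create("/data/b.txt"), Delete("/data/a.txt"))

  // The current namespace is the snapshot with every edit applied.
  val current: Namespace = replay(fsimage, editLog)

  def main(args: Array[String]): Unit = println(current)
}
```

If either piece is lost when the NameNode goes down, the current namespace cannot be rebuilt, which is why both need to survive a failure.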

The total availability of HDFS

View original post 310 more words

Multiple Feeds at one place: MultiFeed App


Okay, what if I tell you there is an app? :D Ever wished for an app where you could add the feeds of all your favourite blogs?

Here it is. There may be many others available in the Play Store, but this one is simple, actually very simple: add the feed URL of a blog you like, get its top 20 posts as a simple list, click a post, read it in the app, and close the app. Done. You earned the skill.

Here is a simple running image to show the working of this multi feed app.

Can’t wait? OK, just download it for free from here: Play Store


Keep Learning Keep Sharing 🙂

View original post

Scala Map


A Scala Map is a collection of key-value pairs. A map cannot have duplicate keys, but different keys can have the same value, i.e. keys are unique whereas values can be duplicated.

Maps in Scala are not language syntax. They are library abstractions that you can extend and adapt.

Scala provides mutable and immutable alternatives for maps. The class hierarchy for Scala maps is shown below:

[Image: class hierarchy for Scala maps]

Image is taken from Programming in Scala by Martin Odersky

There’s a base Map trait in package scala.collection, and two subtraits: a mutable Map in scala.collection.mutable and an immutable one in scala.collection.immutable. By default a Scala Map is immutable; if you want to use the mutable one, you need to import it with the statement: import scala.collection.mutable
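
Here is a minimal sketch of the difference between the two. The object name MapExample and the sample keys and values are made up for illustration; the map operations themselves come from the standard library.

```scala
import scala.collection.mutable

object MapExample {
  // Without any import, Map refers to the immutable map.
  val capitals: Map[String, String] = Map("France" -> "Paris", "Japan" -> "Tokyo")

  // "Adding" a pair to an immutable map returns a new map; the original is unchanged.
  val moreCapitals: Map[String, String] = capitals + ("India" -> "New Delhi")

  // The mutable variant is reached through the imported package name.
  val counts: mutable.Map[String, Int] = mutable.Map("a" -> 1)
  counts("b") = 2   // adds a new key
  counts("a") = 10  // keys are unique, so this overwrites the old value

  def main(args: Array[String]): Unit = {
    println(capitals)
    println(moreCapitals)
    println(counts)
  }
}
```

Importing the package rather than mutable.Map itself keeps the plain name Map bound to the immutable version, so the two kinds stay easy to tell apart in the same file.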

View original post 1,282 more words

Starting HiveServer2 Programmatically


HiveServer2 (HS2) is a server interface that enables remote clients to execute queries against Hive and retrieve the results (a more detailed intro here). The current implementation, based on Thrift RPC, is an improved version of HiveServer and supports multi-client concurrency and authentication. It is designed to provide better support for open API clients like JDBC and ODBC.

In this blog we will learn how to use HiveServer2 with Java. Don’t get confused: Hive does not need to be installed on your machine. We are creating a standalone HiveServer2 that will run in embedded mode, so let’s get started.

For this purpose I am using sbt. First create an sbt project with IntelliJ, then add the following dependencies to build.sbt:

libraryDependencies += "org.apache.hive" % "hive-exec" % "1.2.1" excludeAll ExclusionRule(organization = "org.pentaho")
libraryDependencies += "org.apache.hadoop" % "hadoop-common" % "2.7.3"
libraryDependencies += "org.apache.httpcomponents" % "httpclient" % "4.3.4"
libraryDependencies += "org.apache.hadoop" % "hadoop-client" %…

View original post 798 more words

Smattering of HDFS


Hadoop is an open-source framework that lets you store and process big data in a distributed environment across clusters of computers. Its file system, HDFS, has many similarities with existing distributed file systems. However, the differences from other distributed file systems are significant, as it provides high-performance access to data across Hadoop clusters. Like other Hadoop-related technologies, HDFS has become a key tool for managing pools of big data and supporting big data analytics applications. It is the primary storage system used by Hadoop applications.
HDFS is highly fault-tolerant and is designed to be deployed on low-cost hardware. It provides high-throughput access to application data and is suitable for applications that have large data sets.
HDFS uses a master/slave architecture in which the master is a single NameNode that manages the file system metadata, and one or more slave DataNodes store the actual data.

What are NameNodes and DataNodes?

The NameNode

View original post 312 more words

Getting Started with Apache Cassandra


Why Apache Cassandra?

Apache Cassandra is a free, open source, distributed data storage system that differs sharply from relational database management systems.
Cassandra has become so popular because of its outstanding technical features. It is durable, seamlessly scalable, and tuneably consistent.
It performs blazingly fast writes, can store hundreds of terabytes of data, and is decentralized and symmetrical so there’s no single point of failure.
It is highly available and offers a schema-free data model.


Cassandra is available for download from the Web here. Just click the link on the home page to download the latest release version, then unzip the downloaded Cassandra archive to a local directory.

Starting the Server:

View original post 221 more words