Art of creation

Sunday, 24 March 2013

Many Eyes - Data Visualization tool by IBM

Introduction

Many Eyes is a data visualization tool offered as SaaS by IBM through its Cognos group.
Many Eyes is a "shared visualization and discovery" service.
It is another rich way to present data and content on the Web.
It lets anyone on the internet upload data, visualize it, and discuss their discoveries with other people.
Users can upload their own datasets or work with existing datasets on the site.
The site offers many ways to visualize a dataset.
When you need to create interactive visualizations quickly, IBM’s Many Eyes is a good option.

Creating a visualization

Now, let us see how to create visualizations using Many Eyes.

1. Create a login at Many Eyes. Here is the URL:

http://www-958.ibm.com

2. Upload a data set, or choose one of the existing data sets on Many Eyes, and create a visualization from it. To upload your own data set, go to ‘Upload a data set’ under the ‘Participate’ heading on the left side of the web page.

If your data is a list of values, first format it into a table with informative column headers. If your columns have different units of measure, be sure to include the units in the headers. Use a spreadsheet program such as Microsoft Excel, or a text file in which columns are separated by tabs.

This is the data set I have chosen - Market share of mobile companies in Q4

Company	4Q12 Market Share (%)	4Q11 Market Share (%)
Samsung	22.70	19.60
Nokia	18.00	23.40
Apple	9.20	7.40
ZTE	3.40	4.00
LG Electronics	3.20	3.50
Huawei Device	2.90	2.90
TCL Communication	2.40	2.20
Lenovo	1.80	1.10
Sony Mobile Communications	1.70	1.90
Motorola	1.70	2.10
Others	33.20	31.80

3. Copy the above text and paste it into the rectangular space provided.

4. Provide the title for the data and other information on the following screen. Only the title is mandatory; all other fields are optional. Click on Create.

5. The final data set will look as follows.

6. Click on Visualize to select the type of visualization you want for your data. For example, I have chosen the "Matrix" type, for which the visualization is shown below.

A wide range of customization options is available, and you can choose among them as per your requirement. Several other visualization types, such as bubble charts, histograms, network diagrams, pie charts, and line graphs, are also available.
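The tab-separated layout asked for in step 2 can also be produced from R instead of a spreadsheet. The following is a minimal sketch using only the first three rows of the data set above; the object names (`share`, `out_file`) and the use of a temporary file are illustrative choices and have nothing to do with Many Eyes itself.

```r
# First three rows of the market-share table, with units in the headers.
share <- data.frame(
  Company = c("Samsung", "Nokia", "Apple"),
  q4_2012 = c(22.70, 18.00, 9.20),
  q4_2011 = c(19.60, 23.40, 7.40)
)
names(share) <- c("Company", "4Q12 Market Share (%)", "4Q11 Market Share (%)")

# Write the table as tab-separated text, ready to paste into the upload box.
out_file <- tempfile(fileext = ".txt")
write.table(share, out_file, sep = "\t", quote = FALSE, row.names = FALSE)

lines_out <- readLines(out_file)
cat(lines_out, sep = "\n")
```

Setting `quote = FALSE` and `row.names = FALSE` keeps the output free of quotation marks and row numbers, so the pasted text contains nothing but the headers and the values.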

Pros

- Huge variety of visualizations

- Allows analysis of complex data

- No software to download or install

- Lets you explore others’ datasets and visualizations

Cons

- Once you upload the data, you cannot edit the values

- Might crash the browser

Friday, 15 March 2013

IT lab session 8

Assignment 8 : Panel Data Analysis

We will carry out a panel data analysis of the "Produc" data set.

We will fit three types of model:
      Pooled model
      Fixed effects model
      Random effects model

Commands:

Loading the package and the data:
> library(plm)
> data(Produc, package = "plm")
> head(Produc)

Pooled Model

> pool <- plm(log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp), data = Produc, model = "pooling", index = c("state", "year"))
> summary(pool)

Fixed Effects Model:

> fixed <- plm(log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp), data = Produc, model = "within", index = c("state", "year"))

> summary(fixed)

Random Effects Model:

> random <- plm(log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp), data = Produc, model = "random", index = c("state", "year"))

> summary(random)
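What the "within" model does can be illustrated with base R alone. In the sketch below, subtracting each group's mean from every variable and then running ordinary least squares gives the same slope as a regression with one dummy variable per group. The data are simulated and all names are hypothetical; this is not the Produc data and not plm's internal code.

```r
set.seed(1)
# Simulated panel: 5 groups observed over 10 periods (hypothetical data).
g <- factor(rep(1:5, each = 10))
alpha <- rep(c(-2, -1, 0, 1, 2), each = 10)  # group-specific intercepts
x <- rnorm(50) + alpha                       # regressor correlated with the group effect
y <- 1.5 * x + alpha + rnorm(50, sd = 0.1)

# Within transformation: subtract the group mean from each variable.
x_w <- x - ave(x, g)
y_w <- y - ave(y, g)
slope_within <- unname(coef(lm(y_w ~ x_w - 1)))

# Least-squares dummy variables: one intercept per group.
slope_lsdv <- unname(coef(lm(y ~ x + g))["x"])

# Pooled OLS ignores the group effects and is biased for these data.
slope_pooled <- unname(coef(lm(y ~ x))["x"])

print(c(within = slope_within, lsdv = slope_lsdv, pooled = slope_pooled))
```

The within and dummy-variable slopes agree, while the pooled slope drifts away from the true value of 1.5 because the group effect is correlated with `x`.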

Comparison

The models are compared through hypothesis tests built on the following idea:

H0 (null hypothesis): the individual (state) and time effects are all zero

H1 (alternative hypothesis): at least one of the individual or time effects is non-zero

Pooled vs Fixed

Null Hypothesis: Pooled Model

Alternative Hypothesis: Fixed Effects Model

Command:

> pFtest(fixed, pool)

Result:

Since the p-value is negligible, we reject the null hypothesis in favour of the alternative, i.e. the Fixed Effects Model is preferred over the Pooled Model.
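The idea behind this F test can be sketched in base R: fit the model once with a common intercept and once with one intercept per group, then compare the two residual sums of squares. The data below are simulated and the names are hypothetical; the sketch shows the principle only and is not how plm implements `pFtest`.

```r
set.seed(2)
# Simulated panel with real group effects (hypothetical data, not Produc).
g <- factor(rep(1:4, each = 12))
effect <- rep(c(0, 1, 2, 3), each = 12)
x <- rnorm(48)
y <- 2 * x + effect + rnorm(48, sd = 0.5)

pooled  <- lm(y ~ x)      # one common intercept
dummies <- lm(y ~ x + g)  # one intercept per group

# F statistic built from the two residual sums of squares.
rss0 <- sum(resid(pooled)^2)
rss1 <- sum(resid(dummies)^2)
df_num <- df.residual(pooled) - df.residual(dummies)
f_stat <- ((rss0 - rss1) / df_num) / (rss1 / df.residual(dummies))
p_value <- pf(f_stat, df_num, df.residual(dummies), lower.tail = FALSE)

print(c(F = f_stat, p = p_value))
# anova() performs the same comparison of nested models.
print(anova(pooled, dummies))
```

A small p-value says that the group intercepts reduce the residual sum of squares by more than chance would explain, which is the case made for the fixed effects model above.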

Pooled vs Random

Null Hypothesis: Pooled Model

Alternative Hypothesis: Random Effects Model

Command (Lagrange multiplier test):

> plmtest(pool)

Result:

Since the p-value is negligible, we reject the null hypothesis in favour of the alternative, i.e. the Random Effects Model is preferred over the Pooled Model.

Random vs Fixed

Null Hypothesis: No correlation between the individual effects and the regressors (Random Effects Model)

Alternative Hypothesis: Fixed Effects Model

Command (Hausman test):

> phtest(fixed, random)

Result:

Since the p-value is negligible, we reject the null hypothesis in favour of the alternative, i.e. the Fixed Effects Model is preferred over the Random Effects Model.

Conclusion:

So, after making all the comparisons, we conclude that the Fixed Effects Model is best suited for the panel data analysis of the "Produc" data set.

Hence, we conclude that there are state-specific effects: each "state" has its own intercept, which does not vary over the years within that state, and these effects are correlated with the regressors.
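The way the three comparisons combine into one choice can be written down as a small helper. This is a sketch only: the function `choose_model`, its argument names, and the 5% significance level are assumptions made for illustration and are not part of the plm package.

```r
# Hypothetical helper: pick a model from the p-values of the three tests.
choose_model <- function(p_f, p_lm, p_hausman, alpha = 0.05) {
  fixed_beats_pooled  <- p_f  < alpha  # F test: pooled vs fixed
  random_beats_pooled <- p_lm < alpha  # LM test: pooled vs random
  if (!fixed_beats_pooled && !random_beats_pooled) return("pooled")
  if (fixed_beats_pooled && !random_beats_pooled)  return("fixed")
  if (!fixed_beats_pooled && random_beats_pooled)  return("random")
  # Both beat the pooled model: the Hausman test decides between them.
  if (p_hausman < alpha) "fixed" else "random"
}

# All three p-values negligible, as in the results above.
print(choose_model(1e-10, 1e-10, 1e-10))
# No test rejects its null hypothesis.
print(choose_model(0.40, 0.30, 0.90))
```

With the models fitted earlier, the p-values could be read from the test results, for example `pFtest(fixed, pool)$p.value`, since these plm tests return objects of class "htest".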

Wednesday, 13 February 2013

IT Lab Session 6

Assignment 1: Find the historical volatility and log of returns data

Assignment 2: Create ACF plot of log returns and do Augmented Dickey-Fuller test
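A minimal sketch of both assignments in base R follows. The price series is simulated, the variable names are hypothetical, and annualising with 252 trading days per year is an assumption; the downloaded series would take the place of `prices`.

```r
set.seed(3)
# Simulated daily closing prices (hypothetical; replace with the real series).
prices <- 100 * exp(cumsum(rnorm(250, mean = 0, sd = 0.01)))

# Log returns: difference of consecutive log prices.
log_ret <- diff(log(prices))

# Historical volatility: standard deviation of the log returns,
# annualised assuming 252 trading days per year.
daily_vol <- sd(log_ret)
annual_vol <- daily_vol * sqrt(252)

# Autocorrelation of the log returns (values only; acf(log_ret) draws the plot).
acf_vals <- acf(log_ret, lag.max = 10, plot = FALSE)

print(c(daily = daily_vol, annual = annual_vol))
print(round(acf_vals$acf[1:4], 3))
```

The Augmented Dickey-Fuller test is not part of base R. One option is `adf.test()` from the tseries package, applied to the series once that package is installed.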

Thursday, 7 February 2013

IT Lab Session 5

Assignment 1: Download a large NSE data set (at least 6 months) and generate returns, taking the 10th data point as the start and the 95th data point as the end
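Selecting the window and computing returns might look like the sketch below. The prices are simulated stand-ins for the NSE download, the names are hypothetical, and simple (not log) returns are assumed.

```r
set.seed(4)
# Simulated closing prices standing in for the NSE download (hypothetical data).
close_price <- 5000 + cumsum(rnorm(130, sd = 20))

# Keep observations 10 through 95, then compute simple returns:
# (price today - price yesterday) / price yesterday.
selected <- close_price[10:95]
returns <- diff(selected) / head(selected, -1)

print(length(selected))
print(head(round(returns, 4)))
```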

Assignment 2: Predict the data for rows 701 to 850 of the given data set
Ans: Treated the "age" and "ed" attributes as categorical variables and worked accordingly

Wednesday, 23 January 2013

IT Lab Session 3

Assignment 1a: Create a linear regression model and analyze the impact of groove on mileage. Comment about the applicability of this model based on residual plot.

From the residual plot we can infer that the linear regression model cannot be applied for the sample data.

Assignment 1b: Create a linear regression model and analyze the impact of alpha on pluto. Comment about the applicability of this model based on residual plot.

From the above residual plot we can infer that the linear regression model can be applied.
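The residual check used in both parts can be sketched as follows. The data are simulated with a deliberately curved relationship, so they correspond to the case where the linear model does not apply; `x` and `y` are hypothetical stand-ins for the lab variables.

```r
set.seed(5)
# Hypothetical data with a curved relationship, so a straight line fits poorly.
x <- seq(1, 10, length.out = 40)
y <- 30 - 0.4 * (x - 5.5)^2 + rnorm(40, sd = 0.5)

fit <- lm(y ~ x)
res <- resid(fit)

# Residuals against fitted values: a visible pattern (here a curve)
# suggests that the linear model is not adequate.
plot(fitted(fit), res, xlab = "Fitted values", ylab = "Residuals",
     main = "Residual plot")
abline(h = 0, lty = 2)
```

When the residuals scatter randomly around the zero line with no pattern, the linear model is a reasonable description of the data.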

Assignment 2: Using ANOVA, compare the mean comfort level of 3 different kinds of chairs.

As the p-value is > 0.05, we cannot reject the null hypothesis of equal means, so the data show no significant difference in comfort level among the 3 chairs.
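A one-way ANOVA of this kind can be run with `aov`. The comfort scores below are simulated and the column names are hypothetical; they are not the lab data.

```r
set.seed(6)
# Hypothetical comfort scores for three chair types, drawn from one distribution.
comfort <- data.frame(
  chair = factor(rep(c("A", "B", "C"), each = 10)),
  score = rnorm(30, mean = 7, sd = 1)
)

fit <- aov(score ~ chair, data = comfort)
print(summary(fit))

# Extract the p-value of the chair factor from the ANOVA table.
p_value <- summary(fit)[[1]][["Pr(>F)"]][1]
# TRUE means no significant difference in mean comfort at the 5% level.
print(p_value > 0.05)
```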

Wednesday, 16 January 2013

IT Lab session 2

Assignment -1 : Create 2 matrices, select 2 columns, one from each matrix, and combine them into a new matrix using the cbind function.

Assignment -2: Multiplication of 2 matrices
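Both matrix assignments fit in a few lines of base R. The matrix values are chosen only for illustration.

```r
# Two 3 x 3 matrices, filled column by column.
a <- matrix(1:9, nrow = 3)
b <- matrix(10:18, nrow = 3)

# Take column 2 of a and column 3 of b and bind them side by side.
combined <- cbind(a[, 2], b[, 3])
print(combined)

# Matrix product uses %*%; the * operator multiplies element by element.
product <- a %*% b
print(product)
```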

Assignment-3: Generate regression data from 01 Dec 2012 to 31 Dec 2012 NSE data

Assignment-4: Generate normal distribution data and plot it
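One way to generate and plot normally distributed data is sketched below; the sample size, mean, and standard deviation are arbitrary choices.

```r
set.seed(7)
# 1000 draws from a normal distribution with mean 50 and sd 10.
x <- rnorm(1000, mean = 50, sd = 10)

# Histogram on the density scale with the theoretical curve overlaid.
hist(x, freq = FALSE, main = "Simulated normal data", xlab = "Value")
curve(dnorm(x, mean = 50, sd = 10), add = TRUE, lwd = 2)
```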

Tuesday, 8 January 2013

IT Lab Session 1

Assignment - 0 : Plot a line using 3 points

Assignment - 1 : Plot a histogram of the data obtained from a local file

Assignment - 2 : Plot the points in the form of lines and points. Done using the "plot" command with type="b". Also name the graph and label the x,y axes.
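A minimal version of this plot, with three points chosen only for illustration:

```r
# Three points (values chosen only for illustration).
x <- c(1, 2, 3)
y <- c(2, 4, 3)

# type = "b" draws both the points and the line segments joining them;
# main, xlab and ylab set the title and the axis labels.
plot(x, y, type = "b", main = "Line through three points",
     xlab = "x value", ylab = "y value")
```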

Assignment - 3: Plot a scatter plot for the data chosen from a local file.

Assignment - 4 : Find the range of values in 2 different columns merged together.
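The range over two merged columns can be found by concatenating them first. The data frame `dat` and its column names are hypothetical stand-ins for the local file.

```r
# Hypothetical two-column data set standing in for the local file.
dat <- data.frame(open = c(101.5, 99.2, 103.8), close = c(100.1, 104.6, 98.7))

# Merge the two columns into one vector, then take the minimum and maximum.
merged <- c(dat$open, dat$close)
value_range <- range(merged)
print(value_range)
```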