Google Page Rank eigen vectors
Group members: James Aviles
Contents
Summary
Googles page rank is defined by using the eigenvalues of a matrix which is defined by the connections a website has TO and FROM itself (also known as a node network.
By defining the probabilty of being on a given node as the inverse of the sum of its connections and then multiplying this matrix with the original matrix we achieve the google page rank matrix (which is a matrix of the proportions of time spent at a given node.)
We then compute and pull the eigenvector corresponding to eigen value = 1 from the set of eigenvectors retrieved using matlabs eig() function
Example
imshow('1.png') example =[0 0 0 1 0 % here we define the network according to connections 1 0 1 0 0; % F R O M 1 1 0 1 0; % T 0 0 1 0 1; % O 0 1 1 1 0] %
example = 0 0 0 1 0 1 0 1 0 0 1 1 0 1 0 0 0 1 0 1 0 1 1 1 0

D = diag(sum(example,1))% Define matrix D as the sum of a nodes connections G=example*inv(D) % Define Google rank matrix as the the node matrix % multiplied by the inverse of the Sum of nodes % matrix (which defines the probability) [eigvec,eigval]=eig(G) % extract the eigen vectors and values from G v=eigvec(:,1) % Take the eigen vector corresponding to eigen value of 1 v=v/sum(v) % Normalize that vector to represent probabilities
D = 2 0 0 0 0 0 2 0 0 0 0 0 3 0 0 0 0 0 3 0 0 0 0 0 1 G = 0 0 0 0.3333 0 0.5000 0 0.3333 0 0 0.5000 0.5000 0 0.3333 0 0 0 0.3333 0 1.0000 0 0.5000 0.3333 0.3333 0 eigvec = Columns 1 through 4 0.2175 + 0.0000i 0.0353 - 0.4315i 0.0353 + 0.4315i -0.3122 + 0.0000i 0.2610 + 0.0000i -0.4689 + 0.1415i -0.4689 - 0.1415i 0.3660 + 0.0000i 0.4567 + 0.0000i -0.3074 + 0.0000i -0.3074 - 0.0000i -0.3484 + 0.0000i 0.6525 + 0.0000i 0.6149 + 0.0000i 0.6149 + 0.0000i 0.6968 + 0.0000i 0.5002 + 0.0000i 0.1262 + 0.2901i 0.1262 - 0.2901i -0.4021 + 0.0000i Column 5 0.2209 + 0.0000i 0.4417 + 0.0000i -0.7730 + 0.0000i -0.2209 + 0.0000i 0.3313 + 0.0000i eigval = Columns 1 through 4 1.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0386 + 0.4718i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0386 - 0.4718i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i -0.7438 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i Column 5 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i -0.3333 + 0.0000i v = 0.2175 0.2610 0.4567 0.6525 0.5002 v = 0.1042 0.1250 0.2188 0.3125 0.2396
In the example case, the fourth node is ranked the highest
Problem 1
We have summarized the above code into the function 'prank'
We must now solve for the google page rank of the following matrix
imshow('2.png')

problem1=[0 0 1 0 0 0 % define the matrix 1 0 0 0 0 0; 0 1 0 1 1 0; 1 0 1 0 0 1; 0 1 0 0 0 0; 0 0 1 0 1 0] prank(problem1) % Solve for the google page rank
problem1 = 0 0 1 0 0 0 1 0 0 0 0 0 0 1 0 1 1 0 1 0 1 0 0 1 0 1 0 0 0 0 0 0 1 0 1 0 v = 0.1176 0.0588 0.3529 0.3088 0.0294 0.1324 ans = 0.1176 0.0588 0.3529 0.3088 0.0294 0.1324
For problem 1, the third node is ranked the highest
Problem 2
Do the same for this next network
imshow('3.png')

problem2=[0 0 1 0 0 0 0 1 0 0 0 0 0 0; 0 1 0 1 0 0 0; 0 0 1 0 0 0 1; 0 0 0 1 0 1 0; 0 0 0 0 1 0 1; 0 0 0 1 1 0 0] prank(problem2)
problem2 = 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 1 0 1 0 0 0 1 1 0 0 v = 0.0455 0.0455 0.0909 0.1364 0.2727 0.2273 0.1818 ans = 0.0455 0.0455 0.0909 0.1364 0.2727 0.2273 0.1818
For problem 2, the fifth node is ranked highest
the following network is of my own creation
imshow('4.png')

problem3=[0 0 0 1 0 1 0 1 0 0; 1 1 0 0 0; 1 1 0 0 1; 1 0 1 0 0] prank(problem3)
problem3 = 0 0 0 1 0 1 0 1 0 0 1 1 0 0 0 1 1 0 0 1 1 0 1 0 0 v = 0.2857 0.1429 0.1429 0.2857 0.1429 ans = 0.2857 0.1429 0.1429 0.2857 0.1429
I had initially planned on checking the case for the page rank of node 1 when it was connected to all nodes but no nodes were connected to it. however this was throwing an error because node 1 would have a page rank value of 0
upon connecting node 5 to 4, and node 4 to 1, every node then obtained a value which allowed the program to continue.
interestingly node 4 and 1 have the same page rank, because node 4 needs only to connect to node 1 in order to piggy back off of its connections to increase its page rank.