''[w,L] = surfer('http://www.unh.edu',M);'' (or whatever website you choose). After some time this will generate a vector $w$ of websites and a matrix $L$ of links between them. All the entries of $L$ should be either 0 or 1.
  
**(b)** Now from the matrix $L$, generate a transition matrix $A$ that governs the dynamics of people clicking on random links, according to $x^{n+1} = A x^n$. Similar to the hamster dynamics problem above, the $i$th component of the vector $x^n$ is the fraction of websurfers visiting the $i$th webpage at time $n$. You will probably run into a problem: a webpage might contain no links, or only links outside the set of pages we are working with. In that case we will just assume that the person randomly selects a new website among all the $M$ pages in our set. Revise your calculation of $A$ so that if a column of $L$ sums to zero, then each of the entries in that column of $A$ is $1/M$.
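For example, here is a minimal sketch of one way to build $A$, assuming the convention that ''L(i,j) = 1'' when page $j$ links to page $i$ (so each column of $A$ should sum to 1):

<code matlab>
c = sum(L,1);                  % c(j) = number of links on page j
A = zeros(M,M);
for j = 1:M
    if c(j) == 0
        A(:,j) = 1/M;          % dead-end page: jump to any page uniformly
    else
        A(:,j) = L(:,j)/c(j);  % follow one of page j's links at random
    end
end
</code>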
  
**(c)** Suppose that instead of always clicking on a random link within a page, a web surfer sometimes (with probability $\alpha$) jumps to a random page chosen from all the pages in the network. We can model this behavior by using a modified transition matrix $B$ and dynamics $x^{n+1} = B x^n$, as follows:

$B = (1-\alpha) \, A + \frac{\alpha}{M} \, [1]$

where $[1]$ is an $M \times M$ matrix of ones. Rumor has it that Google uses $\alpha = 0.15$, so use that value and calculate $B$.
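In Matlab this is a one-liner; a sketch, assuming the matrix $A$ from part (b) and the formula above:

<code matlab>
alpha = 0.15;                            % probability of a random jump
B = (1-alpha)*A + (alpha/M)*ones(M,M);   % blend link-clicking with random jumps
</code>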
  
**(d)** Double-check that the sum of each column of $B$ is 1. Again, be clever and get Matlab to do the work, rather than listing out the sums of all the columns and verifying manually that each of the $M$ numbers is 1!
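For instance, the following one-liner reports the largest deviation of any column sum from 1, which should be at floating-point round-off level:

<code matlab>
max(abs(sum(B,1) - 1))   % should be tiny, e.g. around 1e-15
</code>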
  
**(e)** Let’s assume that all websurfers start at the first webpage, ''x=zeros(M,1); x(1)=1;''. If we iteratively apply ''B'' to ''x'' many times (say 40 times), the resulting vector will give the probability that we end up on a particular website after a long session of random web surfing. We can do this with a for-loop:
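A minimal sketch of that loop, with $N = 40$ iterations:

<code matlab>
N = 40;                    % length of the random-surfing session
x = zeros(M,1); x(1) = 1;  % everyone starts at the first webpage
for n = 1:N
    x = B*x;               % advance the distribution one click
end
</code>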
  
**(f)** If you are interested: Our calculation involved two free parameters: the probability $\alpha$ of jumping to an entirely random page in the network, and $N$, the number that specifies the length of our long session of random web surfing. How robust is the page rank algorithm with respect to these two parameters? If you change $\alpha$ to 0.1 or 0.05, do the top 10 pages change? How about if you change $M$ to 64, 128, 256, 512, or 1024? (Some of these larger values might take a really long time.)
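One hypothetical way to run the $\alpha$ experiment, reusing $A$ from part (b) and the vector ''w'' of page URLs from ''surfer'' (the names and loop structure here are illustrative, not prescribed):

<code matlab>
for alpha = [0.15 0.1 0.05]
    B = (1-alpha)*A + (alpha/M)*ones(M,M);
    x = zeros(M,1); x(1) = 1;
    for n = 1:40
        x = B*x;
    end
    [~,idx] = sort(x,'descend');  % pages ordered from most to least visited
    fprintf('alpha = %.2f, top 10 pages:\n', alpha)
    disp(w(idx(1:10)))
end
</code>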
  
----