Forum for discussion of mid, final term and final project

[Epidemic Protocols] Doubts regarding cache updating

[Epidemic Protocols] Doubts regarding cache updating

by FRANCESCO LANDOLFI -
Number of replies: 5

I have a doubt about how we update the entries of the cache: I understand that after selecting a peer, the two nodes exchange their caches, merge them, and then remove the oldest entries such that the cache has a predefined size c.

My question is: when updating, should we overwrite the previous entries from the same peer, if there was already one in the cache? Or should we just add another entry and maybe have multiple entries from the same peer?

I ask this question because it seems that, if c = 20 ~ 50, after some simulation steps, a grid graph becomes disconnected (in other words, more than 1 connected components). Allowing multiple entries of the same peer cause each node to exploit circa half of its cache to get its neighbors, since the observed average degree of the nodes is ~c/2 (not allowing multiple edges between two nodes).

The paper "Newscast Computing" by Jelasity et al. is not very clear about this, but it states that

"The graph Dt is c-regular, that is, all correspondents have cache size c and therefore c
outgoing edges (provided the number of correspondents is larger than c
)."

and also

 "[...] there is a directed edge from correspondent c1 to c2 if and only if the address of c2 is contained in the cache of c1."

so the c outgoing edges in the Dt graph should connect to c distinct nodes, which is highly improbable if we allow repetitions.

My implementation may be wrong, but I want to specify that:

  • the random selection of the nodes is not biased by the number of entries from the same peer (each peer is counted as 1 and selected uniformly at random;
  • the random selection exclude the possibility that a peer selects itself;
  • at each simulation step, the nodes are sorted by the instant in which they were last updated (from the oldest to the most recently updated), and then re-updated in that order (sequentially).
I attach four pictures representing the evolution of a grid graph of 10×100 nodes. After just 3 steps, the graph becomes disconnected.
Attachment step-0.png
Attachment step-1.png
Attachment step-2.png
Attachment step-3.png
In reply to FRANCESCO LANDOLFI

Re: [Epidemic Protocols] Doubts regarding cache updating

by FRANCESCO LANDOLFI -

Ok, I tried avoiding peer repetitions in the same cache and now it works like a charm, even with c down to 15 (the paper showed results only with c ≥ 20)!

In reply to FRANCESCO LANDOLFI

Re: [Epidemic Protocols] Doubts regarding cache updating

by Laura Ricci -

Just for a better understanding: the first graph is the grid graph and the other 3 graphs are the components after the graph becomes disconnected?

In reply to Laura Ricci

Re: [Epidemic Protocols] Doubts regarding cache updating

by FRANCESCO LANDOLFI -

No, the first graph represents the initial state of the network: it has the default layout generated by Cytoscape, but you can actually see that, if you "unroll" it, it's a grid. The other 3 pictures represent the states of the same graph respectively after 1,2, and 3 steps of the newscast algorithm.

In the 4th picture (i.e., the 3rd step) you can see that there is a small disconnected cluster of 3 nodes in the upper part of the figure.

The algorithm does not provide a method to reconnect automatically those nodes, so it will remain disconnected for the whole simulation. Moreover, as the simulation goes further, the graph becomes even more disconnected, as depicted in the attached picture where, after 20 steps, the graph has 114 connected components.

By the way, this is what happened when I allowed multiple cache entries by the same peer, and it seems this what caused the problem.

Attachment step-20.png
In reply to FRANCESCO LANDOLFI

Re: [Epidemic Protocols] Doubts regarding cache updating

by DAVIDE RUCCI -

Personally I think that we should not have repetitions in the caches. I may be wrong but I see repetitions useless, like in the midterm (we are talking about peer connections, so there is no benefit in having two or more connections to same peer, you just need one connection).

Also, I'm not sure if it is correct that a peer exchanges the view with itself, it should do it with a peer selected at random from its view, and its view should not contain itself.

In reply to DAVIDE RUCCI

Re: [Epidemic Protocols] Doubts regarding cache updating

by FRANCESCO LANDOLFI -
I agree with you,  the cache should not contain repetitions, also because otherwise the algorithm fails (as showed before).

Also, I agree that a peer should not select itself, since it would be a null cycle: the peer exchanges the cache with itself, so each entry is replaced with the same version of itself, and so on. 

I do not agree that the cache should not contain a copy of the data of the peer owner of the cache: not because it's wrong (in fact it's not only useless but disadvantageous, since it wastes a cache position and reduces the degree of the node by 1), but because, unlike the cache repetition problem which it's unclear but can be deduced from the text, the fact that a peer contains an entry of itself is explicitly stated in the paper:

Each correspondent executes the following five steps once every ∆T time units.
  1. Request a fresh news item from the local agent by calling getNews(). Create a new cache entry and insert it into the cache.
  2. [...]