Peer-to-Peer Systems Data Integration

by Antonio M
In this article the author writes about Peer-to-Peer (P2P) data integration and
proposes a way to manage schema information in P2P systems. He first introduces
a multi-layer structure for P2P systems that manages metadata and registers
data sources. In this architecture the nodes are separated into two levels: the
schema management level and the source level. Nodes in the schema management
level are known as management nodes, and nodes in the source level are known as
source nodes. Management nodes manage schema information and other metadata
from source nodes, while source nodes submit data integration requirements and
provide data services. Each management node is in charge of a group of source
nodes, and each source node belongs to one management node. Management nodes
also communicate with one another and can share the schemas and metadata of the
source nodes they are in charge of. Source nodes can likewise communicate among
themselves and share data for data integration.
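
To make the two-level layout concrete for myself, I wrote a minimal sketch of it in Python. This is only my reading of the architecture, not code from the paper: the class names `SourceNode` and `ManagementNode`, the methods `register`, `connect`, and `schema_catalog`, and the idea of storing a schema as a tuple of attribute names are all my own assumptions.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class SourceNode:
    """Source-level node: provides data and belongs to one management node."""
    name: str
    schema: tuple  # attribute names the source exposes (assumed representation)
    manager: "ManagementNode | None" = field(default=None, repr=False)


@dataclass(eq=False)
class ManagementNode:
    """Schema-management-level node: registers sources and keeps their schemas."""
    name: str
    sources: dict = field(default_factory=dict)            # source name -> SourceNode
    peers: list = field(default_factory=list, repr=False)  # other management nodes

    def register(self, source: SourceNode) -> None:
        # Enforce "each source node belongs to one management node".
        if source.manager is not None and source.manager is not self:
            raise ValueError(f"{source.name} already belongs to {source.manager.name}")
        source.manager = self
        self.sources[source.name] = source

    def connect(self, other: "ManagementNode") -> None:
        # Management nodes communicate with each other, so the link is symmetric.
        if other not in self.peers:
            self.peers.append(other)
        if self not in other.peers:
            other.peers.append(self)

    def schema_catalog(self) -> dict:
        # The schema metadata this node could share with its peers.
        return {name: src.schema for name, src in self.sources.items()}
```

For example, registering a source with one management node and then trying to register it with a second one raises a `ValueError`, which mirrors the rule that a source node has exactly one management node.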

In the proposed P2P system, data integration begins after a source node submits
a data integration requirement. The source node transfers its requirements to
its management node, which is then called the start management node. The start
management node passes the requirements to the other management nodes, which
search within their groups of source nodes for sources that match the
requirements from the initial source node. Management nodes that locate
matching source nodes notify the start management node. The start management
node then evaluates the returned sets of source nodes using an algorithm the
author proposes, and this algorithm selects the set of data sources that will
execute the data integration.
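
The request flow can be sketched the same way. The article I read does not leave me able to reproduce the author's evaluation algorithm, so `select_best` here is a stand-in that simply picks the candidate set with the highest average quality score. The `quality` field, the attribute-set matching in `search`, and every name in the block are my own assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Source:
    name: str
    attributes: frozenset  # schema attributes the source offers (assumed)
    quality: float         # hypothetical quality score in [0, 1]


@dataclass(eq=False)
class Manager:
    name: str
    sources: list = field(default_factory=list)
    peers: list = field(default_factory=list, repr=False)

    def search(self, required: frozenset) -> list:
        # Search this node's own group for sources that cover the requirement.
        return [s for s in self.sources if required <= s.attributes]

    def integrate(self, required: frozenset) -> list:
        # Acting as the start management node: pass the requirement to the
        # other management nodes and collect the sets they report back.
        candidate_sets = []
        for peer in self.peers:
            found = peer.search(required)
            if found:  # only nodes that located matches notify the start node
                candidate_sets.append(found)
        return select_best(candidate_sets)


def select_best(candidate_sets: list) -> list:
    # Placeholder for the paper's algorithm (NOT the author's method):
    # choose the returned set with the highest mean quality score.
    if not candidate_sets:
        return []
    return max(
        candidate_sets,
        key=lambda group: sum(s.quality for s in group) / len(group),
    )
```

Following the description, the start management node only forwards the requirement to the other management nodes; whether it also searches its own group is something I could not tell, so the sketch leaves that out.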

At first this article was a little confusing, but with some time and patience I
was able to understand what the author was trying to convey. I think the
proposed P2P system is a great idea and a good way to help maintain data
quality during integration within a P2P system.

I think this pertains to what we are learning in class because it deals with
data integration and with maintaining data quality while integrating. I think
this is good information for any student who wants to learn a thing or two
about designing a P2P system that achieves quality data through integration.
Aside from the difficulty of following the algorithms used to identify the
qualified data sources, I found the ideas proposed in this article very useful.

Reference:

Zhigang Zhao, “Data Quality-Oriented Data Integration in Peer-to-Peer System,”
Hybrid Intelligent Systems, 2009. HIS ’09. Ninth International Conference on, vol. 2,
pp. 419-422, 12-14 Aug. 2009. doi: 10.1109/HIS.2009.199
URL: http://0-ieeexplore.ieee.org.opac.library.csupomona.edu/stamp/stamp.jsp?tp=&arnumber=5254497&isnumber=5254374

2 thoughts on “Peer-to-Peer Systems Data Integration”

  1. I find your article very interesting. I personally use peer-to-peer from time to time, and when I think about it I have never related databases to it, because I see peer-to-peer as a real-time application. I never knew databases could relate to peer-to-peer. It seems true that databases relate to anything we do, which makes me think of the article I did for this week, which talks about security issues with databases, both technical and non-technical.

  2. Confusing indeed... but yes, with patience I suppose many things are possible. I always found P2P networking capabilities interesting because they utilize the computing power on the devices of the masses as opposed to a centralized power. Although many of our own peers (myself included) cannot yet comprehend exactly what your author is explaining, in time we will probably come to realize its importance and become excited about learning his new technique for data integration.

    And it's more than likely that anything involving one or more complicated algorithms will produce something really cool, really useful, or both.

    Neat find, cheers!
