by Antonio M
In this article the author writes about Peer-to-Peer (P2P) data integration and
proposes a way to manage schema information in P2P systems. He first introduces
a multi-layer structure for P2P systems that manages metadata and registers
data sources. In this P2P system architecture the nodes are
separated into two levels. The first level is the schema management level and the
second level is the data source level. The nodes in the schema management level
are known as management nodes, and the nodes in the data source level are known as
source nodes. Management nodes manage schema information and other metadata
from source nodes, while source nodes submit data integration
requirements and provide data services. Each management node is in charge of a group
of source nodes, and each source node belongs to one management node. Management
nodes also communicate with one another and can share the schemas and
metadata of the source nodes they are in charge of. Source nodes can also
communicate among themselves and share data for data integration.
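The two-level structure described above can be pictured with a small Python sketch. This is only an illustration under my own assumptions, not code from the article: the class names SourceNode and ManagementNode, the fields, and the methods register, connect, and shared_metadata are all hypothetical names chosen for this review.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class SourceNode:
    """A node in the data source level (hypothetical model, not the author's code)."""
    name: str
    schema: frozenset  # attribute names this source exposes (illustrative)
    manager: "ManagementNode | None" = None


@dataclass(eq=False)
class ManagementNode:
    """A node in the schema management level that tracks a group of source nodes."""
    name: str
    sources: dict = field(default_factory=dict)  # source name -> SourceNode
    peers: list = field(default_factory=list)    # other management nodes

    def register(self, source: SourceNode) -> None:
        # Each source node belongs to one management node, so reject a second owner.
        if source.manager is not None and source.manager is not self:
            raise ValueError(f"{source.name} already belongs to {source.manager.name}")
        source.manager = self
        self.sources[source.name] = source

    def connect(self, other: "ManagementNode") -> None:
        # Management nodes communicate with each other, so link both directions.
        if other not in self.peers:
            self.peers.append(other)
        if self not in other.peers:
            other.peers.append(self)

    def shared_metadata(self) -> dict:
        # Schema metadata that this node could share with its peers.
        return {name: sorted(src.schema) for name, src in self.sources.items()}


m1, m2 = ManagementNode("M1"), ManagementNode("M2")
m1.connect(m2)
s1 = SourceNode("S1", frozenset({"id", "price"}))
m1.register(s1)
```

Registering s1 with m2 afterwards would raise a ValueError, which mirrors the rule that a source node has a single management node.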
In the proposed P2P system, data integration begins after a source node
submits a data integration requirement. The source node transfers its requirements to
its management node, which is then called the start management node. The
start management node passes the requirements to the other management nodes,
which search within their own groups of source nodes for sources that meet the
requirements of the initial source node. Once management
nodes have located source nodes matching the specified requirements, they
notify the start management node. The start management node then
evaluates the returned sets of source nodes using an algorithm the author
proposes, and this algorithm selects the set of data sources used for
executing the data integration.
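The workflow above can be sketched roughly in Python. This is a minimal illustration under stated assumptions, not the author's method: the function names find_candidates and integrate, the dictionary layout, and the per-source "quality" score are all hypothetical, and the final step uses a simple average of those scores as a stand-in for the evaluation algorithm proposed in the article.

```python
def find_candidates(group, required_attrs):
    """Search one management node's group for sources that cover the requirement."""
    return [s for s in group if required_attrs <= s["attrs"]]


def integrate(start_manager, groups, required_attrs):
    """Return (manager, source names) for the selected set, or None if nothing matches."""
    replies = {}
    # Step 1: the start management node forwards the requirement to the other managers.
    for manager, group in groups.items():
        if manager == start_manager:
            continue  # following the summary, only the other managers search
        matches = find_candidates(group, required_attrs)
        # Step 2: only managers that located matching sources notify the start node.
        if matches:
            replies[manager] = matches
    if not replies:
        return None

    # Step 3: placeholder evaluation (an assumption): average of a made-up quality score.
    def mean_quality(sources):
        return sum(s["quality"] for s in sources) / len(sources)

    best = max(replies, key=lambda m: mean_quality(replies[m]))
    return best, [s["name"] for s in replies[best]]


groups = {
    "M1": [{"name": "A", "attrs": {"id", "price"}, "quality": 0.9}],
    "M2": [{"name": "B", "attrs": {"id", "price", "date"}, "quality": 0.6},
           {"name": "C", "attrs": {"id"}, "quality": 0.99}],
    "M3": [{"name": "D", "attrs": {"id", "price"}, "quality": 0.8}],
}
result = integrate("M1", groups, {"id", "price"})
```

Here M2 answers with source B and M3 answers with source D, and the placeholder scoring picks the set from M3 because its average score is higher.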
At first this article was a little confusing, but with some time and
patience I was able to understand what the author was trying to convey. I
think the proposed P2P system is a great idea and a good way to help
maintain data quality during integration within a P2P system.
I think this pertains to what we are learning in class because it deals with
data integration and the effort to maintain it. I think
this is good information for any student who wants to learn a thing or two
about how to design a P2P system that achieves quality data through integration.
Aside from the difficulty of following and understanding the algorithms used to identify the
qualified data sources, I found the ideas proposed in this article very useful.
Zhigang Zhao, “Data Quality-Oriented Data Integration in Peer-to-Peer System,”
Hybrid Intelligent Systems, 2009. HIS ’09. Ninth International Conference on, vol. 2,
pp. 419-422, 12-14 Aug. 2009. doi: 10.1109/HIS.2009.199