BestPeer integrates cloud computing, database, and
peer-to-peer (P2P) technologies to deliver efficient query
processing in a pay-as-you-go manner, which makes it a promising approach for corporate
network applications. BestPeer adopts the Software-as-a-Service
(SaaS) paradigm and is deployed as a service on the cloud.
We have deployed BestPeer on Amazon’s EC2
cloud platform. To form a corporate network, companies first
register their sites with the BestPeer service provider. Then,
they can launch BestPeer instances (i.e., Amazon EC2 virtual
machine instances) on the cloud and upload data to those
instances for sharing. BestPeer adopts the pay-as-you-go
business model popularized by cloud computing. The total cost of
ownership is therefore substantially reduced, since companies do not have to
buy any hardware or software in advance. Instead, they pay for what
they use in terms of BestPeer instance hours and storage
capacity. The BestPeer service provider elastically scales up
the running instances and makes them always available.
Therefore, companies can take a Return on Investment (ROI)
driven approach and invest progressively in the data sharing
system.

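The pay-as-you-go model described above can be illustrated with a small cost calculation. This is a minimal sketch under stated assumptions: the `UsageRates` class, the `monthly_cost` function, and the prices are hypothetical and are not taken from BestPeer or from Amazon EC2 pricing. It only shows that the bill depends on instance hours and storage capacity, so a company can start small and grow its spending with its usage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRates:
    """Hypothetical unit prices (assumed for illustration only)."""
    per_instance_hour: float  # price of one instance running for one hour
    per_gb_month: float       # price of storing one GB for one month


def monthly_cost(rates: UsageRates, instance_hours: float, storage_gb: float) -> float:
    """Return the bill for one month: instance hours plus storage capacity."""
    if instance_hours < 0 or storage_gb < 0:
        raise ValueError("usage must be non-negative")
    return rates.per_instance_hour * instance_hours + rates.per_gb_month * storage_gb


# Assumed prices; real prices depend on the provider and instance type.
rates = UsageRates(per_instance_hour=0.10, per_gb_month=0.05)

# A company starts with 2 instances running all month (720 hours each) ...
small = monthly_cost(rates, instance_hours=2 * 720, storage_gb=100)
# ... and later scales to 8 instances and four times the storage.
large = monthly_cost(rates, instance_hours=8 * 720, storage_gb=400)
```

No hardware purchase appears anywhere in the calculation: with zero usage the bill is zero, which is the property that supports an incremental, ROI-driven investment.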
The BestPeer network consists of two kinds of
running instances: a single bootstrap peer and many normal peers. The
bootstrap peer is the entry point of the whole network, which is managed by
the service provider. This peer is responsible for monitoring and managing
normal-peer registration and for scheduling various network management events.
The bootstrap peer also maintains metadata of the corporate network
applications that the service provider serves. For each corporate network
application, the bootstrap peer stores the shared global schema and a list
of the normal peers that participate in that corporate network.
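The metadata kept by the bootstrap peer can be pictured as a small registry. The following is a minimal sketch, not BestPeer's implementation: the class names, method names, and the representation of the global schema as a mapping from table names to column names are all assumptions made for illustration. It records, for each corporate network, the shared global schema and the set of participating normal peers.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CorporateNetwork:
    """Metadata for one corporate network (hypothetical structure)."""
    # Assumed schema representation: table name -> list of column names.
    global_schema: dict[str, list[str]]
    participants: set[str] = field(default_factory=set)


class BootstrapPeer:
    """Hypothetical registry of corporate networks and their normal peers."""

    def __init__(self) -> None:
        self._networks: dict[str, CorporateNetwork] = {}

    def create_network(self, name: str, global_schema: dict[str, list[str]]) -> None:
        if name in self._networks:
            raise ValueError(f"network {name!r} already exists")
        self._networks[name] = CorporateNetwork(global_schema=dict(global_schema))

    def register_peer(self, network: str, peer_id: str) -> None:
        """Add a normal peer to the participant list of a network."""
        self._networks[network].participants.add(peer_id)

    def deregister_peer(self, network: str, peer_id: str) -> None:
        """Remove a normal peer; removing an unknown peer is a no-op."""
        self._networks[network].participants.discard(peer_id)

    def participants(self, network: str) -> list[str]:
        return sorted(self._networks[network].participants)

    def schema(self, network: str) -> dict[str, list[str]]:
        return self._networks[network].global_schema


bootstrap = BootstrapPeer()
bootstrap.create_network("supply-chain", {"orders": ["id", "supplier", "amount"]})
bootstrap.register_peer("supply-chain", "peer-retailer")
bootstrap.register_peer("supply-chain", "peer-supplier")
bootstrap.deregister_peer("supply-chain", "peer-retailer")
```

Note that the registry holds only metadata. The shared data itself stays on the normal peers.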
Normal peers are the BestPeer instances launched by
businesses. Each normal peer is owned and managed by a single business and
serves the data retrieval requests issued by the users of the owning
business. To achieve high throughput, BestPeer does not rely on a
centralized server to locate which normal peer holds which tables. Instead,
the normal peers are organized as a balanced binary tree structured
peer-to-peer overlay network called BATON. Query processing is thus
performed in an entirely distributed manner.
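To illustrate why a balanced tree overlay removes the need for a central lookup server, here is a much-simplified sketch. It is not the BATON protocol: real BATON nodes also keep sideways routing tables and can start a search from any peer, whereas this sketch only descends from the root. The `Node` class, the `build_balanced` and `locate` functions, and the use of table names as search keys are assumptions made for illustration. Each peer owns a contiguous key range, and a lookup is forwarded from peer to peer until it reaches the owner.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One peer in the tree, owning keys in the range [low, high)."""
    peer_id: str
    low: str
    high: str
    left: Optional[Node] = None
    right: Optional[Node] = None


def build_balanced(ranges: list[tuple[str, str, str]]) -> Optional[Node]:
    """Build a balanced tree from (peer_id, low, high) tuples sorted by range."""
    if not ranges:
        return None
    mid = len(ranges) // 2
    peer_id, low, high = ranges[mid]
    return Node(peer_id, low, high,
                build_balanced(ranges[:mid]),
                build_balanced(ranges[mid + 1:]))


def locate(root: Optional[Node], key: str) -> tuple[str, int]:
    """Return (owning peer, number of forwarding hops) for a key."""
    node, hops = root, 0
    while node is not None:
        if key < node.low:
            node = node.left       # key belongs to a peer with a smaller range
        elif key >= node.high:
            node = node.right      # key belongs to a peer with a larger range
        else:
            return node.peer_id, hops
        hops += 1
    raise KeyError(key)


# Assumed partitioning of lowercase table names among four peers.
# "{" is the character after "z", so the last range covers "t" through "z...".
root = build_balanced([
    ("peer-A", "a", "g"),
    ("peer-B", "g", "n"),
    ("peer-C", "n", "t"),
    ("peer-D", "t", "{"),
])
owner, hops = locate(root, "customer")
```

Because the tree is balanced, the number of hops grows logarithmically with the number of peers, and each step is handled by a peer rather than by a central directory.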