We needed a distributed hosted phone system that provides the availability and scalability required for business-grade communications. To deliver it, Voxbit reimagined the IP phone system and built a robust hosted VoIP system that is distributed, scalable and secure.
To achieve our objective we needed to divide the project into five initial areas.
- Application layer
- Distributed Core Infrastructure
- Shared state Messaging Layer
- Edge service layer
- Distributed hosted phone system application
The application layer treats each function of a phone system as an autonomous service application (or module). Each application refactors part of the core programming of a PBX into a standalone application that shares its state across a distributed network. This lets each user scale individual services without the expense (memory and CPU) of scaling the entire system. Each of these standalone modules needs GUIs for users, resellers, network management, and customer care. We have already built an extensive list of applications in this way, including Voicemail, IVRs, Time Conditions, Hold & Transfer, Call Recording, Follow Me, Operator Panel, Queue, Hunt Group, Click to Dial, Phone Book, Missed Call, Auto Dialler and Credit Control. The current build of each application module and its GUI is based on existing open-source technology. However, sharing the state of each user and call across the distributed network required development at the level of each individual module.
The Distributed Core network allows bare metal machines to be automatically configured and loaded with the software needed to provide services, as demand or network managers dictate. This required new techniques for managing the network and core infrastructure, because services such as DNS and DHCP were built for static networks. To share real-time data, Voxbit needed to develop new caching protocols to manage the performance of the WAN. Because existing technologies constrain the amount of data according to available bandwidth and latency, we also needed to work on real-time compression of data to manage the WAN more effectively.
Once the Virtual Network was developed, we needed the ability to use network data to allocate the appropriate resources to customers and call traffic. Our co-ordination and scheduling protocols allow us to dynamically shape the servers we have available to provide the services required.
Some of the significant challenges in this area are address space conflicts and the management of many unique routing tables. To get the system to act as a unified whole, we need ways to share the current state of calls, customers, and the applications they use. Rather than try to keep a single copy of the data, with all of the synchronisation delays that creates, we are adopting techniques from parallel computing such as optimistic replication. We are still working to improve some of our message brokers and testing the effectiveness of various messaging protocols.
Our edge services are challenged most by mobility. As consumers become device independent and phone number independent, they will use a wide variety of devices from an even wider array of IP addresses to access the services they need. This is where the foundations described above really pay off. Customers can access their voicemail, optimise their call routing, and join calls on a single phone system from many different locations. Dynamic configuration to ensure effective NAT traversal will present new challenges, as the system adapts to the demands of a particular client configuration or to the user’s software or firewall needs.
We have brought all of these segments together into one single state. At an engineering level, the shared state segment ties all of the services together. For the users and managers of the system, however, it is the GUI that controls the system and brings it all together. This project requires an enormous amount of GUI development, as we provide access to and management of each application. We will also provide a network management GUI so that we can intervene manually in the event that we need to override any part of the system.
The Segments of the Project
Segment 1: The Application Layer
The initial stage of the project is to establish the autonomous application modules in the platform through the development of the media services & distributed API framework. Within an iterative process, the following key areas of focus have been identified:
Step 1: Application API Framework
Given the software-driven nature of the Platform, the API framework is a fundamental component of the overall architecture. It will provide the basis for all inter-module and media service communication, and all future development will stem from this step.
Step 2: Publish/Subscribe Interfaces
Based on a shared-nothing architecture, the application layer will allow the autonomous modules to discover and communicate with one another. It uses the Messaging Layer (Segment 3) in a pub/sub manner to distribute state and event data throughout the Application Layer and the Edge Services Layer (Segment 4).
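As a rough illustration of the publish/subscribe pattern described in this step, the sketch below uses an in-process stand-in for the Messaging Layer. The Bus class, the topic name and the event fields are hypothetical and are not part of the Platform; a real deployment would carry these messages across the network.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

Handler = Callable[[str, Dict[str, Any]], None]


class Bus:
    """Hypothetical in-process stand-in for the Messaging Layer."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: Dict[str, Any]) -> int:
        """Deliver the event to every subscriber and return the delivery count."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(topic, event)
        return len(handlers)


bus = Bus()
voicemail_seen: List[Dict[str, Any]] = []
recording_seen: List[Dict[str, Any]] = []

# Two autonomous modules subscribe without knowing about each other.
bus.subscribe("call.ended", lambda topic, event: voicemail_seen.append(event))
bus.subscribe("call.ended", lambda topic, event: recording_seen.append(event))

delivered = bus.publish("call.ended", {"call_id": "c-1", "user": "alice"})
```

The publisher only names a topic, so modules can be added or removed without changing one another, which is the property a shared-nothing design relies on.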
Step 3: Feature Modules
The development of application modules provides the end-user functionality in the form of services and features. Designed to function independently or in concert with others, these modules deliver on the commercial goals of the project.
Step 4: Extend & Integrate
This step extends the application layer further by integrating it with the new and existing business services required for the daily operation of the platform. These include essential integration work such as billing control, fraud detection services, and security policies, as well as end-user graphical interfaces and back-office administration and support tools.
Segment 2: Distributed core infrastructure
The distributed core infrastructure segment is split into two core elements.
(a) Core Infrastructure
(b) Network Exchange
The distributed core infrastructure will form the foundation of the entire environment for the overall IP platform. The shared state messaging layer (Segment 3) will use the distributed core to augment the delivery of the edge services (Segment 4). The application layer (the end products) will use these public-facing services as ‘on net’ connectivity to deliver the voice media services.
Segment 2 : (a) Core Infrastructure
This will provide the Points of Presence (POPs), which use both bare metal (physical devices) and hypervisor technology to apply software defined networking principles in delivering the IP platform.
The process for research and development of these principles will be as follows:
Step 1 : Dispersed Storage System
The platform requires research into how it can use one or more dispersed storage systems so that data can be shared across a cluster of bare metal machines. The overall principles will be established on a shared voting architecture and will provide the framework for scalability.
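One common reading of a voting architecture is majority quorum: a read or write only succeeds when more than half of the machines take part, so any two successful operations share at least one machine. The sketch below assumes that reading; the Node and MajorityStore names and the key format are hypothetical, and the design of the actual dispersed storage system is still a research outcome.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """One bare metal machine holding (version, value) pairs (hypothetical)."""
    up: bool = True
    data: Dict[str, Tuple[int, str]] = field(default_factory=dict)


class MajorityStore:
    def __init__(self, nodes: List[Node]) -> None:
        self.nodes = nodes
        self.quorum = len(nodes) // 2 + 1  # more than half must vote

    def _live(self) -> List[Node]:
        return [n for n in self.nodes if n.up]

    def write(self, key: str, value: str) -> bool:
        live = self._live()
        if len(live) < self.quorum:
            return False  # not enough votes, reject the write
        # Any majority overlaps the previous write, so it sees the latest version.
        version = 1 + max(n.data.get(key, (0, ""))[0] for n in live)
        for node in live:
            node.data[key] = (version, value)
        return True

    def read(self, key: str) -> Optional[str]:
        live = self._live()
        if len(live) < self.quorum:
            return None  # not enough votes to answer safely
        votes = [n.data[key] for n in live if key in n.data]
        return max(votes)[1] if votes else None


nodes = [Node(), Node(), Node()]
store = MajorityStore(nodes)
store.write("ext-100", "registered")
nodes[0].up = False                      # one machine drops out
store.write("ext-100", "on-call")        # still succeeds with two of three
nodes[0].up, nodes[1].up = True, False   # the stale machine returns
latest = store.read("ext-100")           # the newer version wins the vote
```

The store keeps working while a minority of machines is down, and a machine that returns with stale data is outvoted by version number.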
Step 2 : Network Services
The existing network services designed for managing networks and controlling core infrastructure have a number of limitations that will have a direct impact on the objective of a distributed platform. Specific examples include DNS and DHCP, both of which were originally designed for static networks. The project will have to research, define and redevelop the currently adopted methods of structured networking so that the hypervisors can exploit the distribution and scale of the network. At this stage, there is little or no published information to support this. The outcomes of this element will determine the overall impact on Segments 3 & 4 and the capability of the platform to maximise its distribution potential.
Step 3 : WAN Compression
To ensure that the platform delivers geographical diversity, supporting resilience through high availability and reducing latency through local access, the project needs to investigate methods of routing real-time data between POPs. Traditional networks exploit caching technology to reduce the volume of data transferred and so optimise the performance of the WAN. In voice networking, real-time data does not currently benefit from any of these caching technologies. This element of the project will explore the potential use of compression and deduplication for real-time voice data routed across the WAN connections between POPs.
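A minimal sketch of the deduplication and compression idea follows, assuming content-hash deduplication with zlib compression between two POPs. The class names and the sample payloads are hypothetical. Whether real voice payloads, which are usually already codec-compressed, repeat often enough to benefit is exactly the question this step sets out to investigate.

```python
import hashlib
import zlib
from typing import Dict, List, Set, Tuple

Frame = Tuple[str, bytes]  # ("ref", digest) or ("raw", compressed chunk)


class DedupSender:
    def __init__(self) -> None:
        self._seen: Set[bytes] = set()

    def encode(self, chunk: bytes) -> Frame:
        digest = hashlib.sha256(chunk).digest()
        if digest in self._seen:
            return ("ref", digest)  # the peer already holds this chunk
        self._seen.add(digest)
        return ("raw", zlib.compress(chunk))


class DedupReceiver:
    def __init__(self) -> None:
        self._cache: Dict[bytes, bytes] = {}

    def decode(self, frame: Frame) -> bytes:
        kind, body = frame
        if kind == "ref":
            return self._cache[body]  # look the chunk up by its digest
        chunk = zlib.decompress(body)
        self._cache[hashlib.sha256(chunk).digest()] = chunk
        return chunk


sender, receiver = DedupSender(), DedupReceiver()
# Hypothetical payloads: the first and third chunks are identical.
chunks: List[bytes] = [b"comfort-noise" * 20, b"speech-frame-1", b"comfort-noise" * 20]
frames = [sender.encode(c) for c in chunks]
restored = [receiver.decode(f) for f in frames]
wire_bytes = sum(len(body) for _, body in frames)
```

The repeated chunk crosses the WAN as a 32-byte digest rather than as its full payload, and the receiver rebuilds the original stream from its local cache.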
Step 4 : Actor Control
An actor is best described as the lowest common denominator within an autonomous system. This step needs to investigate current principles and best working practices in order to make use of the dispersed storage and network services research and development completed in Steps 1 & 2. These developments and their outputs will provide the controls for the project to scale the platform across geographically separate, resilient POPs. The actor controls will be used to define horizontal and vertical deployments and will provide the definitions and triggers for the life cycle management of the platform.
Step 5 : Security
Traditional security principles are preventative by design, protecting against known and predefined exploits. In this project the new IP platform will be dynamic and scalable, which introduces new challenges in the security layers. The platform will use new methods of dispersed storage, adopt the principles of the new network services, benefit from real-time voice data routing, and be deployed using a structured methodology based on actor control. No current security method supports this combination, and none is sufficient to prevent unauthorised access to and use of the application layer. This step will require research and development that uses the actor controls to perform pattern recognition dynamically, through an event-driven method, across all ‘live’ instances in real time. The security layer needs to provide a framework that treats the distributed core network as a single entity. It will then actively and dynamically recognise simultaneous threats and unauthorised access to services, using the developments built into the platform from the outputs of Steps 1 to 4.
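To make the idea of event-driven pattern recognition across live instances more concrete, the sketch below flags a source address that fails authentication on several instances within a short sliding window. The ThreatDetector class, the threshold values and the instance names are hypothetical, and events are assumed to arrive in timestamp order; the real detection rules remain to be researched.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Tuple


class ThreatDetector:
    """Flags a source that fails authentication on several instances at once."""

    def __init__(self, window_s: float, min_instances: int) -> None:
        self.window_s = window_s
        self.min_instances = min_instances
        self._events: Dict[str, Deque[Tuple[float, str]]] = defaultdict(deque)

    def on_auth_failure(self, ts: float, source_ip: str, instance: str) -> bool:
        """Record one failure event and return True if the pattern is matched."""
        events = self._events[source_ip]
        events.append((ts, instance))
        # Drop events that have fallen out of the sliding window.
        while events and events[0][0] < ts - self.window_s:
            events.popleft()
        distinct_instances = {name for _, name in events}
        return len(distinct_instances) >= self.min_instances


detector = ThreatDetector(window_s=10.0, min_instances=3)
alerts = [
    detector.on_auth_failure(0.0, "203.0.113.7", "pop-a"),
    detector.on_auth_failure(2.0, "203.0.113.7", "pop-b"),
    detector.on_auth_failure(4.0, "203.0.113.7", "pop-c"),   # third instance in 10 s
    detector.on_auth_failure(30.0, "203.0.113.7", "pop-a"),  # old events have expired
]
```

No single instance sees enough failures to react on its own; the pattern only appears when events from the whole distributed core are examined together.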
Step 6 : Documentation
The core infrastructure segment will need to be fully documented and provided for the project.
Segment 2 : (b) Network Exchange
Moving beyond the statically defined infrastructure traditional network topologies provide, SDN (Software Defined Networking) principles remove the constraints typically associated with specialised hardware devices. Compute, storage, networking, security, and availability services are pooled, aggregated and delivered as software resources that can be dynamically managed in an event-driven or policy-driven manner.
Augmenting the efforts outlined in Segment 2 (a) (Core Infrastructure), and providing the foundations for Segments 3 & 4 (Messaging Layer and Edge Services respectively), the research into the Network Exchange will provide the Platform with the engine required to intelligently route and deliver real-time voice data both within a POP and between geographically separate POPs.
The process of research and development of these principles will be as follows:
Step 1: Multi-tenant Encapsulation
Because of the limitations of VLAN and switching boundaries, including the address-space conflicts commonly associated with network exchanges, current industry approaches are impractical and inefficient for supporting the platform. Exploring network data encapsulation in conjunction with SDN principles will provide the capability to create isolated, logical network segments within a multi-tenant environment, spanning traditional network boundaries and supporting backwards compatibility.
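The sketch below shows how encapsulation can keep tenants apart even when their inner addresses collide. It assumes a VXLAN-style 8-byte header carrying a 24-bit tenant identifier, chosen purely for illustration; this document does not name a specific encapsulation, and the function names are hypothetical.

```python
import struct
from typing import Tuple

VNI_VALID_FLAG = 0x08  # "I" flag of the VXLAN header (RFC 7348)


def encapsulate(tenant_id: int, inner_frame: bytes) -> bytes:
    """Prefix the tenant's frame with an 8-byte header carrying its identifier."""
    if not 0 <= tenant_id < 2 ** 24:
        raise ValueError("tenant id must fit in 24 bits")
    # Word 1: flags in the top byte, then 24 reserved bits.
    # Word 2: 24-bit tenant id, then 8 reserved bits.
    header = struct.pack("!II", VNI_VALID_FLAG << 24, tenant_id << 8)
    return header + inner_frame


def decapsulate(packet: bytes) -> Tuple[int, bytes]:
    """Return (tenant id, inner frame) from an encapsulated packet."""
    flags_word, vni_word = struct.unpack("!II", packet[:8])
    if not (flags_word >> 24) & VNI_VALID_FLAG:
        raise ValueError("tenant id flag not set")
    return vni_word >> 8, packet[8:]


# Two tenants use the same private address; the outer header keeps them apart.
packet_a = encapsulate(100, b"to 10.0.0.5")
packet_b = encapsulate(200, b"to 10.0.0.5")
tenant, frame = decapsulate(packet_b)
```

A 24-bit identifier allows far more isolated segments than the 12-bit VLAN ID, which is one reason encapsulation is attractive for multi-tenant networks.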
Step 2: Namespace Routing
At present there is no means of managing isolated routing tables at the scale required for this multi-tenant network. This step of the project requires research into new techniques and methodologies to allow the peering and distribution of many unique routing tables. This namespace routing functionality will support the development of the Platform’s multi-tenant network described in Step 1.
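As a small-scale illustration of isolated routing tables, the sketch below keeps one table per namespace and resolves an address by longest-prefix match within that namespace only. The NamespaceRouter class, the tenant names and the POP names are hypothetical; the hard problem this step targets, peering and distributing such tables at scale, is not addressed here.

```python
import ipaddress
from typing import Dict, List, Optional, Tuple

Route = Tuple[ipaddress.IPv4Network, str]  # (prefix, next hop)


class NamespaceRouter:
    def __init__(self) -> None:
        self._tables: Dict[str, List[Route]] = {}

    def add_route(self, namespace: str, prefix: str, next_hop: str) -> None:
        route = (ipaddress.IPv4Network(prefix), next_hop)
        self._tables.setdefault(namespace, []).append(route)

    def lookup(self, namespace: str, address: str) -> Optional[str]:
        """Longest-prefix match inside a single namespace's table."""
        ip = ipaddress.IPv4Address(address)
        matches = [
            (network.prefixlen, hop)
            for network, hop in self._tables.get(namespace, [])
            if ip in network
        ]
        return max(matches)[1] if matches else None


router = NamespaceRouter()
# The same prefix appears in two namespaces without conflict.
router.add_route("tenant-a", "10.0.0.0/8", "pop-london")
router.add_route("tenant-a", "10.1.0.0/16", "pop-dublin")
router.add_route("tenant-b", "10.0.0.0/8", "pop-frankfurt")

hop_a = router.lookup("tenant-a", "10.1.2.3")
hop_b = router.lookup("tenant-b", "10.1.2.3")
```

The same destination address resolves differently per tenant because each lookup is confined to its own namespace.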
Step 3: Flow Management
The fundamental goal of Software Defined Networking (SDN) is to dynamically manage the structure of a network in real time. This goal directly supports the aims of a globally distributed network infrastructure. This step in the project needs to investigate new methods of identifying individual traffic streams as unique flows and of managing these flows in real time. This step is directly linked to, and dependent on, the outcomes of Steps 1 & 2 above.
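One conventional way to identify a traffic stream as a unique flow is the 5-tuple of source address, source port, destination address, destination port and protocol. The sketch below assumes that definition; the FlowTable class and the action strings are hypothetical placeholders for whatever the research in this step produces.

```python
from dataclasses import dataclass
from typing import Dict, NamedTuple


class FlowKey(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str


@dataclass
class FlowState:
    packets: int = 0
    octets: int = 0
    action: str = "forward"  # hypothetical default action


class FlowTable:
    def __init__(self) -> None:
        self.flows: Dict[FlowKey, FlowState] = {}

    def observe(self, key: FlowKey, size: int) -> str:
        """Account for one packet and return the action to apply to it."""
        state = self.flows.setdefault(key, FlowState())
        state.packets += 1
        state.octets += size
        return state.action

    def set_action(self, key: FlowKey, action: str) -> None:
        """Change how an individual flow is handled while it is in progress."""
        self.flows.setdefault(key, FlowState()).action = action


table = FlowTable()
media = FlowKey("198.51.100.10", 16384, "203.0.113.20", 16386, "udp")
signalling = FlowKey("198.51.100.10", 5060, "203.0.113.20", 5060, "udp")

first = table.observe(media, 172)
table.set_action(media, "reroute:pop-b")  # steer only the media flow
second = table.observe(media, 172)
table.observe(signalling, 480)
```

Because each flow has its own entry, the media stream can be rerouted mid-call without touching the signalling flow between the same two hosts.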
Step 4: Documentation
The network exchange segment will need to be fully documented and provided for the project.
Segment 3 : Messaging Layer
Based on insight gained from previous work in deterministic latency, and borrowing from patterns traditionally found in parallel programming such as optimistic replication, the Platform avoids trying to maintain a single copy of data in favour of allowing the different components and frameworks to diverge. Because the Platform no longer has to wait for all copies of data to be synchronised across a cluster of actors, a high level of concurrency is supported at scale both inside and across geographic POPs.
This segment of the project will concentrate on developing and implementing the real-time messaging infrastructure that enables the adaptive, distributed nature of segments 1, 2, and 4 when completed.
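A minimal sketch of optimistic replication follows: each replica accepts writes locally without waiting for its peers, and the copies are reconciled later. It assumes last-writer-wins reconciliation ordered by a logical clock, which is only one of several possible strategies; the Replica class, the POP names and the keys are hypothetical.

```python
from typing import Dict, Tuple

Stamp = Tuple[int, str]  # (logical clock, replica name) orders competing writes


class Replica:
    def __init__(self, name: str) -> None:
        self.name = name
        self.clock = 0
        self.state: Dict[str, Tuple[Stamp, str]] = {}

    def write(self, key: str, value: str) -> None:
        """Accept the write locally without waiting for other replicas."""
        self.clock += 1
        self.state[key] = ((self.clock, self.name), value)

    def merge(self, other: "Replica") -> None:
        """Reconcile later: for each key keep the entry with the highest stamp."""
        for key, (stamp, value) in other.state.items():
            if key not in self.state or stamp > self.state[key][0]:
                self.state[key] = (stamp, value)
            self.clock = max(self.clock, stamp[0])

    def get(self, key: str) -> str:
        return self.state[key][1]


london, dublin = Replica("london"), Replica("dublin")
london.write("call-42", "ringing")
dublin.merge(london)
dublin.write("call-42", "answered")  # applied in Dublin only
london.write("user-7", "dnd")        # applied in London only

diverged = london.get("call-42") != dublin.get("call-42")
london.merge(dublin)
dublin.merge(london)
```

The two copies disagree for a while, which is the accepted cost of not blocking on synchronisation, and they converge to the same state once they have exchanged updates.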
Step 1: Message Brokers
As a research and development exercise, this step requires investigating existing middleware broker implementations and evaluating the suitability of each for distributing event data in publish/subscribe and peer-to-peer deployments, for both real-time and queued messages. It is expected that no existing broker implementation will be suitable for the Platform’s use case; however, existing protocol specifications such as 0mq or RV may support the feature set required. The research and development will need to substantiate this.
Step 2: Distributed Shared-State
This step in the project requires the full integration, deployment and testing of the Message Brokers and messaging protocol with the output of Segments 1 & 2 in order to distribute shared state and event data. As a result of this step the project will have implemented the distributed backbone of the Platform, forming the basis of one of the core goals set out in the project.
Step 3 : Documentation
The messaging layer segment will need to be fully documented and provided for the project.
Segment 4 : Edge Services
The edge services are used to enable access to the overall platform. These can be in the form of desktop IP phones, SIP trunks, web browsers and even mobiles. The number of endpoints is increasing and therefore it is important to establish a process to build this stage to cater for evolving requirements in the market. In order to support this project, the following process will be used to research and develop the edge services for the platform.
Step 1: Authentication and Translation
This step handles the development required for authentication from multiple endpoints. To manage the identity of SIP accounts across multiple devices and locations, it needs to investigate the potential of URIs (Uniform Resource Identifiers) and URNs (Uniform Resource Names) for translating SIP authentication into endpoint state.
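As one possible shape for this translation, the sketch below reduces a SIP URI to a location-independent key and records each device that registers under it. The parser handles only simple sip: and sips: URIs (no IPv6 literals), and the urn:example:endpoint key format, the function names and the device labels are hypothetical; the real identifier scheme is what this step will investigate.

```python
from typing import Dict, NamedTuple


class SipIdentity(NamedTuple):
    user: str
    domain: str


def parse_sip_uri(uri: str) -> SipIdentity:
    """Extract user and domain from a simple sip:/sips: URI."""
    scheme, _, rest = uri.partition(":")
    if scheme.lower() not in ("sip", "sips") or "@" not in rest:
        raise ValueError(f"unsupported SIP URI: {uri!r}")
    user_part, _, host_part = rest.partition("@")
    user = user_part.split(":", 1)[0]                    # drop any password
    host = host_part.split(";", 1)[0].split("?", 1)[0]   # drop parameters and headers
    domain = host.split(":", 1)[0].lower()               # drop the port
    return SipIdentity(user, domain)


def state_key(identity: SipIdentity) -> str:
    """Hypothetical URN-style key naming the account, not its location."""
    return f"urn:example:endpoint:{identity.domain}:{identity.user}"


def register(registry: Dict[str, Dict[str, str]], uri: str, device: str, contact: str) -> str:
    """Record where one device of the account can currently be reached."""
    key = state_key(parse_sip_uri(uri))
    registry.setdefault(key, {})[device] = contact
    return key


registry: Dict[str, Dict[str, str]] = {}
key_desk = register(registry, "sip:alice@voip.example.com", "desk-phone", "192.0.2.10:5060")
key_mobile = register(
    registry, "sips:alice@VOIP.example.com:5061;transport=tls", "mobile", "198.51.100.4:40112"
)
```

Both registrations land under the same key even though the devices use different transports and addresses, so the rest of the platform can treat them as one account with two reachable endpoints.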
Step 2 : Application Layer Integration
This step in the project will concentrate on the integration of edge services and the application layer.
Step 3 : Shared State
This step will provide a bridge between the SIP authentication and the business logic within the Application Layer. Publishing the shared state of the SIP endpoints across the core elements of the platform enables the cluster of session border controller nodes to communicate effectively, enabling an end-to-end distributed system.
Step 4: Documentation
It is important to accurately finalise and publish the documentation across each of the steps in the Edge Services.
Segment 5: Distributed hosted phone system
As a result of the agile methodology used throughout the project, with the output of each iteration point integrated, the Platform fabric is realised on completion of the four segments outlined. This process will highlight new challenges and may lead to further iterative cycles to resolve them. Further research and development may also be required at this point into new and as yet unknown interdependencies; however, this will not be known until the Platform is established.
Step 1: Platform
While logically separated, each segment of the project demonstrates the interdependent nature of the Platform. As a result, research and development will be carried out in a feature-driven manner commonly known as BDD (Behaviour Driven Development) to reach the state of completion. This iterative process is based on a large number of task oriented sprints to aid in planning and velocity tracking.
Step 2: Business Integration
The nature of the agile methodology lends itself to continuous integration, supporting the commercial nature of the project. Implementing and deploying in many small, incremental sprints allows for immediate functional testing and refinement. This includes not only features of the core platform but also integration with existing business services such as billing and upstream carrier providers.
Step 3: Product Interfaces
Continuing with feature-driven development and the ideologies of BDD, each interaction and process of the user-facing interfaces is detailed as a story prior to beginning a sprint. These stories not only list the requirements of each interaction and flow, but also provide the framework for testable use cases to support the QA effort. This process will be applied not only to end-user customer interfaces but also to back-office Platform management tools.
Step 4: Quality Assurance
Supported throughout the research and development process by clear outlines of requirements, engineering documentation, and iterative testing on completion of each sprint, the final QA element involves regression testing on integration with the staged copy of the Platform, as well as extensive performance and penetration testing, prior to a live GA deployment.
Step 5: Documentation
It is important that the overall documentation is collated and the overall platform is documented as per the final deliverables.