Open Source VoIP & ICT Solutions for Businesses Worldwide

WebRTC and Real-Time Unified Communications

#19 of 20 Innovations

WebRTC (Web Real-Time Communication) delivers real-time audio, video, and data between browsers with no plugin and no proprietary client to install; the browser itself is the endpoint. That is what made it the foundation for modern unified communications: Google Meet, Zoom’s web client, Microsoft Teams on the web, and Discord all rely on WebRTC in the browser to varying degrees. Being browser-native is a genuine competitive advantage for enterprise software, because any web application can embed real-time communications with a few hundred lines of JavaScript and a signalling server. You don’t need to ship a desktop app.

[Figure: WebRTC connection with an SFU. Browser A captures microphone and camera with getUserMedia(), uses STUN/TURN for ICE traversal (NAT punch-through), and sends one DTLS-SRTP-encrypted upload stream to the SFU (LiveKit, mediasoup, Jitsi), which forwards it to Browsers B, C, and D. Labelled use case: ICTContact and ICTBroadcast agent calls. SFU advantage: each sender uploads 1 stream, not N-1 streams as in pure P2P, so the design scales to large groups.]

WebRTC is browser-native — no plugin install required. SFU routes media selectively for efficient large-group conferencing.

The WebRTC stack has three core APIs. getUserMedia captures audio and video from the microphone and camera. RTCPeerConnection manages the media connection: codec negotiation (VP8, VP9, and H.264 for video; Opus for audio, which holds up particularly well under packet loss), ICE traversal to punch through NATs and firewalls, DTLS-SRTP encryption, and bandwidth adaptation. RTCDataChannel opens bidirectional data channels alongside the media streams for file sharing and real-time state sync.

For group calls, pure peer-to-peer doesn’t scale, because each participant would need to upload N-1 streams. That’s where SFUs (Selective Forwarding Units) come in: LiveKit, mediasoup, Janus, and Jitsi receive one stream per participant and forward it selectively to the others, reducing each client’s upload burden to a single stream.

ICTContact and ICTBroadcast integrate WebRTC so agents can take and make calls directly from their browser without installing a softphone, which simplifies contact centre deployments.
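The N-1 upload cost is easy to check with a little arithmetic. The sketch below is illustrative only: the names `perClientLoad` and `meshConnectionCount` are hypothetical, and it assumes every participant sends exactly one stream (no simulcast, no screen share).

```typescript
// Per-client stream counts for an n-party call.
// Assumption: each participant publishes exactly one stream.
type CallTopology = "mesh" | "sfu";

interface StreamLoad {
  uploads: number;   // streams each client sends
  downloads: number; // streams each client receives
}

function perClientLoad(topology: CallTopology, participants: number): StreamLoad {
  if (Math.floor(participants) !== participants || participants < 2) {
    throw new RangeError("a call needs a whole number of at least 2 participants");
  }
  const others = participants - 1;
  // Mesh: one copy of the local stream goes to every remote peer.
  // SFU: one copy goes to the server, which fans it out.
  const uploads = topology === "mesh" ? others : 1;
  // In both cases the client still receives one stream per remote peer.
  return { uploads, downloads: others };
}

// Total peer connections in a full mesh: n(n-1)/2.
function meshConnectionCount(participants: number): number {
  return (participants * (participants - 1)) / 2;
}
```

For a 10-person call, `perClientLoad("mesh", 10)` gives 9 uploads per client, while `perClientLoad("sfu", 10)` gives 1. Downloads stay at 9 in both cases, which is why an SFU relieves the uplink but not the downlink.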

[Figure: SFU vs MCU architecture for group video calls. SFU (Selective Forwarding Unit): Alice, Bob, and Carol each upload 1 stream to an SFU that only routes media; low server CPU; best for 2-100 participants. MCU (Multipoint Control Unit): Alice, Bob, and Carol send to an MCU that mixes all streams server-side and returns 1 composite to each; high server CPU, simple client, higher latency, server bottleneck.]

SFU dominates modern conferencing — LiveKit, mediasoup, and Janus all use SFU architecture. MCU is reserved for legacy interop scenarios.

The enterprise trend is toward communications-embedded applications: instead of switching to a separate meeting tool, users communicate inside their CRM, support platform, or project management tool. That’s possible because WebRTC is browser-native and doesn’t require server-side media handling for the basic two-party call path.

The remaining technical challenges are worth knowing. Media quality under poor network conditions is still hard: packet loss concealment, adaptive bitrate, and jitter buffers are all built into browsers, but they’re imperfect. Large-group sessions beyond roughly 100 participants need careful SFU architecture. And end-to-end encryption in group calls (where even the SFU server shouldn’t decrypt streams) is an active area of development; the Messaging Layer Security (MLS) protocol, which handles group key agreement, is the most promising standardisation effort here.

Frequently Asked Questions

Does WebRTC require a server?

WebRTC needs servers for signalling (exchanging session descriptions and ICE candidates) and relies on STUN/TURN servers for NAT traversal. Media can flow peer-to-peer in two-party calls, but group calls almost always use an SFU or MCU media server to manage routing efficiently. WebRTC is often described as peer-to-peer, but production deployments always involve server infrastructure.
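To make the signalling role concrete, here is a minimal in-memory relay sketch. WebRTC does not define a signalling wire format, so the `SignalEnvelope` shape and the `SignalRelay` class are hypothetical; a real deployment would typically carry these messages over WebSocket or a similar transport.

```typescript
// Hypothetical message envelope: WebRTC leaves the signalling format
// to the application, so these field names are assumptions.
type SignalKind = "offer" | "answer" | "candidate";

interface SignalEnvelope {
  kind: SignalKind;
  from: string;
  to: string;
  body: string; // SDP text, or a serialized ICE candidate
}

type SignalHandler = (msg: SignalEnvelope) => void;

// Relays envelopes between connected peers. It never inspects the body:
// the server only passes session descriptions and candidates along.
class SignalRelay {
  private peers: { id: string; deliver: SignalHandler }[] = [];

  join(id: string, deliver: SignalHandler): void {
    this.leave(id); // a rejoining peer replaces its old handler
    this.peers.push({ id, deliver });
  }

  leave(id: string): void {
    this.peers = this.peers.filter((p) => p.id !== id);
  }

  // Returns false when the target peer is not connected.
  send(msg: SignalEnvelope): boolean {
    for (const peer of this.peers) {
      if (peer.id === msg.to) {
        peer.deliver(msg);
        return true;
      }
    }
    return false;
  }
}

// Usage: Alice sends an offer to Bob; Bob's handler would pass the body
// to its own peer connection and reply with an "answer" envelope.
const demoRelay = new SignalRelay();
const demoBobInbox: SignalEnvelope[] = [];
demoRelay.join("bob", (msg) => { demoBobInbox.push(msg); });
demoRelay.send({ kind: "offer", from: "alice", to: "bob", body: "v=0 (SDP omitted)" });
```

Once offer, answer, and candidates have been exchanged, the relay is out of the media path entirely, which is why signalling servers tend to be lightweight compared with TURN or SFU servers.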

What is the difference between an SFU and an MCU in WebRTC conferencing?

An SFU (Selective Forwarding Unit) receives streams from each participant and forwards them individually without mixing. Each client receives separate streams and renders them itself. An MCU (Multipoint Control Unit) mixes all streams into a single composite before sending to each participant – reducing client processing but increasing server CPU cost significantly. SFUs dominate modern conferencing because they scale better and add less latency.
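The difference shows up clearly in what each participant receives. The sketch below models only the stream bookkeeping, not real media; `sfuDeliveries` and `mcuDeliveries` are hypothetical names, and the MCU model is simplified (real mixers often leave the receiver's own audio out of the composite).

```typescript
interface Delivery {
  to: string;        // receiving participant
  streams: string[]; // labels of the streams that participant gets
}

// SFU: every receiver gets one separate, unmixed stream per other participant.
function sfuDeliveries(participantIds: string[]): Delivery[] {
  return participantIds.map((to) => ({
    to,
    streams: participantIds.filter((id) => id !== to),
  }));
}

// MCU: the server decodes and mixes everything, then sends each receiver
// a single composite. Simplification: one shared mix for all receivers.
function mcuDeliveries(participantIds: string[]): Delivery[] {
  const composite = "mix(" + participantIds.join("+") + ")";
  return participantIds.map((to) => ({ to, streams: [composite] }));
}
```

With Alice, Bob, and Carol in a call, the SFU model hands Alice two streams (Bob and Carol) to decode and lay out herself, while the MCU model hands her one composite. The server-side cost of producing that composite is what drives the MCU's CPU usage.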

How does WebRTC handle poor network conditions?

WebRTC includes NACK (retransmission requests), FEC (proactive redundant data), PLI (keyframe requests after video loss or decode failure), and congestion control such as Google Congestion Control (GCC), driven by receiver feedback like REMB, which adapts bitrate to available bandwidth in real time. These mechanisms are built into browser and SFU implementations and require no application-level code to activate.
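As a rough illustration of bitrate adaptation, the sketch below follows the loss-based rule described in the Google Congestion Control draft: back off when reported loss exceeds 10%, probe upward by 5% when it is under 2%, and hold otherwise. It is not what any particular browser ships. The function name `nextSendBitrate` and the default floor and ceiling are assumptions, and the real algorithm also combines this with a delay-based estimate.

```typescript
// Simplified loss-based rate update, modelled on the rule in the GCC draft.
// lossFraction: share of packets reported lost since the last feedback (0..1).
// minBps / maxBps: illustrative bounds, chosen arbitrarily for this sketch.
function nextSendBitrate(
  currentBps: number,
  lossFraction: number,
  minBps: number = 30000,
  maxBps: number = 2500000
): number {
  let next = currentBps;
  if (lossFraction > 0.10) {
    // Heavy loss: reduce in proportion to the loss seen.
    next = currentBps * (1 - 0.5 * lossFraction);
  } else if (lossFraction < 0.02) {
    // Negligible loss: probe for more bandwidth.
    next = currentBps * 1.05;
  }
  // Between 2% and 10% loss the rate is held steady.
  return Math.min(maxBps, Math.max(minBps, Math.round(next)));
}
```

Starting from 1 Mbit/s, a report of 20% loss yields 900 kbit/s, while a loss-free report yields 1.05 Mbit/s. Applications normally never call anything like this themselves; it only shows the kind of decision the built-in congestion controller makes on each feedback report.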

Can WebRTC be used for softphone replacement in a contact centre?

Yes – this is one of the most common enterprise WebRTC deployments. A WebRTC softphone runs entirely in the browser, connects to a SIP gateway or SFU, and supports hold, transfer, conference, and DTMF. Platforms like ICTContact and ICTBroadcast use WebRTC this way, giving agents full calling capability directly from their browser-based interface without installing or maintaining a standalone softphone.