
Web applications with WebRTC. Camera in the browser

OpenTok, PubNub and WebRTC

Cloud platforms OpenTok and PubNub for developing communication services based on WebRTC

2016-04-08

Web video chats are becoming increasingly popular. A web video chat is a web application for interactive communication built on a web server and browsers.

Of all the existing technologies for building Web video chats (Ajax; Java; Flash technologies; ASP.Net + Silverlight; HTML5 + JavaScript based on the WebRTC API, etc.), the most promising technology is the WebRTC API. Web chats built on the basis of WebRTC technology provide high-quality transmission of text, voice, video and data (files) without installing additional plug-ins or extensions in browsers. The main elements of a WebRTC video chat are a browser and a contact server.

A browser that supports WebRTC becomes a single interface for all user devices (PCs, smartphones, iPads, IP phones, mobile phones, etc.) that work with communication services. WebRTC with WebSocket, HTML5, CSS3, and JavaScript enables the creation of next-generation web communication services. WebRTC technology is implemented by three JavaScript APIs.

For peer-to-peer operation, two browsers that support WebRTC need to reach a signaling server (for example, a WebSocket server running on Node.js) at its IP address. The server takes no part in transferring media between the browsers; it acts only as a signaling server that establishes the connection between the users' browsers.
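To make the role of the signaling server concrete, here is a minimal in-memory sketch of the relaying logic. It is an illustration only: the class name `SignalingRelay`, the room and peer identifiers, and the `send` callbacks are assumptions, and a real server would wrap WebSocket connections instead of plain callbacks.

```javascript
// Minimal in-memory signaling relay (hypothetical helper, not a library API).
// It only forwards small control messages such as offers and answers;
// audio and video never pass through it.
class SignalingRelay {
  constructor() {
    this.rooms = new Map(); // room name -> Map(peerId -> send callback)
  }

  // Register a peer. `send` is whatever delivers a message to that peer
  // (in a real server it would wrap a WebSocket connection).
  join(room, peerId, send) {
    if (!this.rooms.has(room)) this.rooms.set(room, new Map());
    this.rooms.get(room).set(peerId, send);
  }

  leave(room, peerId) {
    const peers = this.rooms.get(room);
    if (!peers) return;
    peers.delete(peerId);
    if (peers.size === 0) this.rooms.delete(room);
  }

  // Forward a message to every other peer in the room.
  // Returns the number of peers the message was delivered to.
  relay(room, fromPeerId, message) {
    const peers = this.rooms.get(room);
    if (!peers) return 0;
    let delivered = 0;
    for (const [peerId, send] of peers) {
      if (peerId === fromPeerId) continue;
      send({ from: fromPeerId, ...message });
      delivered += 1;
    }
    return delivered;
  }
}

// Usage: two simulated browsers exchange an offer through the relay.
const signalingRelay = new SignalingRelay();
const aliceInbox = [];
const bobInbox = [];
signalingRelay.join("demo", "alice", (msg) => aliceInbox.push(msg));
signalingRelay.join("demo", "bob", (msg) => bobInbox.push(msg));
signalingRelay.relay("demo", "alice", { type: "offer", sdp: "<offer SDP>" });
```

After the last call, only the second peer's inbox holds the offer, which mirrors the point above: the server passes signaling messages along and stays out of the media path.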

Because not every hosting provider supports WebRTC, you can implement communication applications based on the WebRTC API and integrate them into websites by using dedicated platforms that support WebRTC and provide APIs and SDKs.

The API provides interaction of communication applications with the platform (Web service) that provides this API, and the SDK ensures the development of communication applications that can interact with the platform that provides this SDK.

These platforms include OpenTok from TokBox, PubNub, VoxImplant, Twilio, SkyWay from NTT Communications, Kandy.io, SightCall, and others. Note that to create a contact server, you can deploy Node.js for web communication applications on a rented VPS or use cloud platform hosting (PaaS) that supports Node.js projects, for example OpenShift (Red Hat), Heroku (Salesforce), or AWS Elastic Beanstalk (Amazon).

In addition, to create a communication application, you can purchase the Flashphoner Web Call Server platform (server software for browser-based online broadcasts of audio and video streams), which is built on HTML5 WebSockets, WebRTC, and Flash, and install it on a web server running Linux or on a rented VPS. Flashphoner has currently deployed its WebRTC server (Web Call Server 5) on Amazon Web Services cloud hosting.

In this review, let's consider the most popular cloud communication web services based on WebRTC technology: OpenTok from TokBox and PubNub.

OpenTok from TokBox

OpenTok is a cloud-based PaaS (Platform as a Service) and a leading real-time communications (RTC) web platform for integrating video communication and messaging tools into websites and mobile applications. OpenTok has a distributed infrastructure with data centers around the world.

TokBox's open OpenTok platform lets developers embed cross-platform video chats based on the WebRTC API into web applications (websites) and into Java/Android and iOS applications.

The OpenTok architecture consists of the client-side OpenTok WebRTC library (for example, OpenTok.js), which embeds video communication in the client part of the application (the web page), and a set of tools (the OpenTok Client SDKs) for developing client applications (JavaScript/web, Java/Android, and iOS).

An integral part of the OpenTok architecture is the set of server SDKs (OpenTok Server SDKs). They are used to build the application's server infrastructure (for managing and authenticating users), which dynamically generates unique OpenTok session IDs (sessionId) and tokens (token) for each user and works with OpenTok archives. The web server passes the appropriate session ID and token to the client, which then uses them to connect to the session.

OpenTok server SDKs are available for the major server-side programming languages: Java, .NET, Node.js, PHP, Python, and Ruby. If the server side of the application requires a language that is not in this list, TokBox provides an OpenTok REST API for creating OpenTok sessions and working with OpenTok archives.

Thus, to create communication applications for websites, you should use the OpenTok server SDKs in combination with the OpenTok WebRTC client library and the OpenTok Client SDKs.

To use the OpenTok platform to create a WebRTC video chat embedded in a website, you first need to create an account on TokBox.com. A free OpenTok account is valid for thirty days. A registered user can obtain the API key (apiKey) required to develop an OpenTok communication application. The API key identifies the OpenTok developer account.

Using the OpenTok Developer Guide (https://tokbox.com/developer/guides/) in your TokBox developer account, you can create a communication application (video chat web page) using the OpenTok Client SDKs. To use the OpenTok framework for your application, you must include the OpenTok.js library in your web page.

https://static.opentok.com/webrtc/ .../opentok.js

The session identifier (sessionId) and token (token) that the application needs are usually created programmatically on a web server with one of the OpenTok Server SDKs.

To create a test version of the application without the OpenTok server SDKs, however, you can use the API key (apiKey) to obtain a session ID and a token for that session manually in the developer dashboard. The client needs the token to gain access to the session.

Example values for the API key "apiKey", the session ID "sessionId", and the token "token" look like this:

var apiKey = "17493650";
var sessionId = "2_MX40NT...tWXR-UH4";
var token = "T1==cGFyd...2RhdGE9";

Using the API key and the session identifier (sessionId), the application initializes the session object:

var session = OT.initSession(apiKey, sessionId);

Then the client is connected to the session and the audio and video streams are published:

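session.connect(token, function(error) {
  // Publish only if the connection succeeded.
  if (error) { console.error(error.message); return; }
  var publisher = OT.initPublisher("publisher");
  session.publish(publisher);
});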

After the client connects to the session, the application initializes an OpenTok Publisher object and publishes the audio and video stream to that session so that other clients can see it. To see other clients, the application subscribes to each stream they create:

session.on({
  // Subscribe to each stream that another client publishes.
  streamCreated: function(event) {
    session.subscribe(event.stream, "subscriber");
  }
});

Thus, following the developer's guide, you can create the basis of a WebRTC video chat on the OpenTok platform and embed it in your site. Figure 2 shows a screenshot of the interface of this WebRTC video chat, created in the TokBox developer account.

Next, you need to create the server part of the video chat using the OpenTok Server SDK for one of the supported programming languages. The OpenTok Server SDKs allow you to programmatically create OpenTok sessions, generate tokens, and work with OpenTok archiving.

Note that TokBox offers two modes of transferring media streams:

relayed: media streams are transmitted directly between peers (for example, between users' browsers in a one-to-one video chat);
routed: the OpenTok Media Router routes audio and video streams between clients (for example, in a multi-user or group video chat for online meetings).
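The practical difference between the two modes can be sketched with a small calculation. The function below is a hypothetical helper, not part of the OpenTok API, and it assumes that every participant publishes one stream and subscribes to everyone else (a full mesh in relayed mode).

```javascript
// Count how many outgoing copies of its own stream each client sends.
// Assumption: every participant publishes to, and subscribes to, all others.
function upstreamCopiesPerClient(mode, participantCount) {
  if (participantCount < 2) return 0; // nobody to send to
  if (mode === "relayed") return participantCount - 1; // one copy per peer
  if (mode === "routed") return 1; // a single copy goes to the media router
  throw new Error("Unknown media mode: " + mode);
}

const oneToOneRelayed = upstreamCopiesPerClient("relayed", 2);
const groupRelayed = upstreamCopiesPerClient("relayed", 5);
const groupRouted = upstreamCopiesPerClient("routed", 5);
```

Under these assumptions a one-to-one relayed call costs each client a single upstream copy, while a group call grows with the number of peers in relayed mode and stays constant in routed mode, which is why routing suits group chats.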

PubNub

PubNub is a global data stream network for the IoT (Internet of Things), mobile, and the web. It is a cloud-based real-time web messaging service designed to organize web communications between various platforms: mobile phones, tablets, web browsers, websites, and so on. PubNub provides over 70 SDKs for major programming languages for creating cross-platform communication applications and embedding them in web applications (websites) and mobile applications (Java/Android and iOS).

The list of languages and SDKs supported by PubNub is available at https://github.com/pubnub/pubnub-api. In addition to APIs for specific software platforms, PubNub also supports REST APIs. The PubNub WebRTC SDK, for example, is designed for real-time web communication between browsers using a peer-to-peer architecture. The architecture of interaction between the components of a communication application based on the PubNub platform and WebRTC technology is shown in Fig. 3.

As the interaction scheme of the WebRTC video chat components shows (Fig. 3), the PubNub platform serves as a scalable signaling server (negotiation server) for WebRTC applications. In addition, the PubNub platform provides features such as presence (information about users available on the network, or an up-to-date list of users), storage/replay (letting users see the history of past conversations over a period of time), and registration.

In WebRTC communication applications based on the PubNub framework, two messaging methods (WebSockets and AJAX) are used between the browser and the negotiation server. PubNub offers a new API for connecting a WebRTC application to the PubNub platform. The PubNub WebRTC API performs signaling between users' browsers so that they can connect in a peer-to-peer architecture using the WebRTC PeerConnection API. After the browsers exchange signaling messages, duplex communication is established between them for exchanging video streams and arbitrary data. PubNub coordinates the communication between browsers.

The PubNub service not only provides the interaction of all the components needed to establish peer-to-peer communication between browsers for real-time messaging, but also gives them a global data streaming network.

To use the PubNub platform for video chat, you first need to register with PubNub and create a free account. A registered user can obtain the API keys subscribe_key and publish_key required to develop a PubNub communication application. You can then add features such as security, presence, and storage to your account.

After obtaining the API keys, you can start creating a communication application on the PubNub platform using the SDK for one of the main programming languages, or use the demo applications (templates). A tutorial on creating a communication application is available at https://www.pubnub.com/docs/tutorials/pubnub-publish-subscribe. A reference guide for creating a video chat with the PubNub WebRTC SDK is available at https://www.pubnub.com/docs/webrtc-javascript/pubnub-javascript-sdk.

To create a WebRTC video chat from scratch, according to the tutorial, you need only a few simple JavaScript API calls:

include the PubNub JavaScript SDK in the HTML code of the page before the client is initialized;
init() - initialize the PubNub client API;
subscribe() - subscribe to a specific channel;
publish() - send a message to a specific channel;
unsubscribe() - unsubscribe from a specific channel.
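The message flow behind these steps can be sketched with a toy in-memory stand-in. This is not the PubNub SDK: the function `initToyPubSub` and the shape of its arguments are assumptions chosen only to mirror the init, subscribe, publish, and unsubscribe steps listed above.

```javascript
// Toy in-memory publish/subscribe object (hypothetical, NOT the PubNub SDK).
// It delivers messages only within the current process.
function initToyPubSub() {
  const channels = new Map(); // channel name -> Set of listener callbacks

  return {
    // Register a callback that receives every message sent to `channel`.
    subscribe({ channel, message }) {
      if (!channels.has(channel)) channels.set(channel, new Set());
      channels.get(channel).add(message);
    },
    // Deliver `message` to all listeners; returns how many received it.
    publish({ channel, message }) {
      const listeners = channels.get(channel);
      if (!listeners) return 0;
      listeners.forEach((callback) => callback(message));
      return listeners.size;
    },
    // Drop all listeners of `channel`.
    unsubscribe({ channel }) {
      channels.delete(channel);
    },
  };
}

const toyClient = initToyPubSub();
const receivedMessages = [];
toyClient.subscribe({ channel: "lobby", message: (m) => receivedMessages.push(m) });
toyClient.publish({ channel: "lobby", message: { text: "hello" } });
toyClient.unsubscribe({ channel: "lobby" });
toyClient.publish({ channel: "lobby", message: { text: "nobody hears this" } });
```

Only the first message reaches the listener, because the second one is published after unsubscribing. A hosted service plays the same role across the network, between browsers.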

The initialization of the PubNub client API can be represented as follows:

// Create the client object; replace the placeholders with your own keys.
var pubnub = PUBNUB.init({
  publish_key: "Your Publish Key",
  subscribe_key: "Your Subscribe Key"
});

To create a WebRTC video chat based on PubNub WebRTC, you can use an open-source template: https://www.pubnub.com/developers/demos/webrtc/. To check how this video chat works, open the specified address on two PCs. In the video chat interface that opens in the browsers, users are assigned phone numbers. To communicate, a user enters the other user's phone number in the "Type Recipient's" text box and clicks the handset button.

As a result, the images from the video cameras are transmitted to the browsers and displayed on screen. This video chat also works as a text chat. To chat, users enter text in the "chat here" field and press the Enter key. Figure 4 shows a screenshot of the video chat of the user with phone number 164.

Fig. 4. PubNub with WebRTC

Figure 5 shows a screenshot of the video chat of the user with phone number 128.

Fig. 5. PubNub with WebRTC

The user interface of the web communication application is built with HTML5 and CSS3. The client-side code of the communication application is written in JavaScript. The following libraries are included in the web communication application: jQuery, the PubNub JavaScript SDK, and the PubNub WebRTC SDK.

Hi friends. As you already know, we regularly update you on new technologies. Today I will introduce WebRTC, a technology developed by Google that lets users talk by video and audio directly in the browser, without plug-ins, websites, or applications. The direct video and audio connection between users takes place in the browser itself.
WebRTC is supported by the Mozilla Firefox and Google Chrome browsers on any operating system, with Opera joining soon.
What is WebRTC and what does it do?
WebRTC is short for Web Real-Time Communication. This technology lets you open audio and video chats directly in the browser without other plug-ins, applications, or internet services. The connection is made directly from browser to browser.
Well-known services (Skype, Yahoo Messenger, Apple FaceTime, Google Hangouts, etc.) require a server that connects users in order to initiate and manage traffic. To use these services we need to register and set up a list of contacts.
With WebRTC, we don't need servers, applications, or intermediaries to connect us.
WebRTC advantages:
1. No more apps consuming resources and battery.
2. Chats are more private (relatively).
3. Connections can be made locally, without going through US servers for local connections.
4. Simplicity, ease of use.
5. The possibility of further development, including in other directions.
6. Communication is stable and does not depend on external connections, which are sometimes extremely unstable.
In the tutorial, I used a demo developed by people at Google. This demo is quite simple; for more advanced features and faster connections, you can use one of the applications that support WebRTC, which are easier to use. Soon we will be making a tutorial about WebRTC applications as well.
How to use the WebRTC demo?
Very simple: click on the link below, and it automatically generates a chat room. Send the link to this room to the friend you want to get in touch with.
Both you and your friend should use only the latest versions of Mozilla Firefox or Google Chrome.

WebRTC demo (introductory audio/video chat)

Attention:
The demo is not very stable, it is made for demonstration purposes only. It can be used for a limited period of time during which small connection errors may occur.
If you're having connectivity issues, try creating a different chat.

WebRTC (Web Real-Time Communications) is a technology that allows web applications and websites to capture and selectively transmit audio and/or video media streams, and to exchange arbitrary data between browsers, without the need for intermediaries. The set of standards that make up WebRTC allows data exchange and peer-to-peer teleconferencing without the user having to install plug-ins or any other third-party software.

WebRTC consists of several interrelated programming interfaces (APIs) and protocols that work together. The documentation you'll find here will help you understand the basics of WebRTC, how to set up and use a data and media connection, and more.

Compatibility

Because WebRTC implementations are still evolving and every browser has a different level of WebRTC functionality, it is highly recommended that you use the Adapter.js polyfill library from Google before you start working on your code.

Adapter.js uses shims and polyfills to smooth over differences in WebRTC implementations among the contexts that support it. Adapter.js also handles vendor prefixes and other property naming differences, which makes WebRTC development easier and its results more consistent. The library is also available as an NPM package.
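To illustrate what handling vendor prefixes means, here is a small sketch of the idea. It is not the library's actual code: `resolvePrefixed` and the simulated `fakeOldBrowser` object are hypothetical, and the prefixed names are the historical WebKit and Mozilla spellings of the constructor.

```javascript
// Sketch of the kind of work a shim does: return the first implementation
// that exists under any of its historical names. Not adapter.js's real code.
function resolvePrefixed(scope, candidateNames) {
  for (const name of candidateNames) {
    if (typeof scope[name] !== "undefined") return scope[name];
  }
  return undefined;
}

// Simulated older browser that exposes only the WebKit-prefixed constructor.
const fakeOldBrowser = {
  webkitRTCPeerConnection: function FakePeerConnection() {},
};

const PeerConnectionCtor = resolvePrefixed(fakeOldBrowser, [
  "RTCPeerConnection",
  "webkitRTCPeerConnection",
  "mozRTCPeerConnection",
]);
```

Application code can then use the resolved constructor without caring which name the browser happened to expose.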

For further exploration of the Adapter.js library, see its documentation.

Concepts and usage of WebRTC

WebRTC is versatile and, together with the Media Capture and Streams API, provides powerful multimedia capabilities for the Web, including support for audio and video conferencing, file sharing, screen capture, identity management, and interoperability with legacy phone systems, including support for DTMF tone dialing. Connections between nodes can be created without special drivers or plug-ins, and often without intermediate services.

The connection between two nodes is represented by an object of the RTCPeerConnection interface. Once a connection has been established and opened using an RTCPeerConnection object, media streams (MediaStream objects) and/or data channels (RTCDataChannel objects) can be added to the connection.

Media streams can consist of any number of tracks of media information. These tracks, represented by objects of the MediaStreamTrack interface, can contain one of several media types, including audio, video, and text (such as subtitles or chapter titles). Most streams consist of at least one audio track or video track, and can be sent and received as live streams (real-time media) or saved to a file.

You can also use the connection between two nodes to exchange arbitrary data through an RTCDataChannel object, which can be used for service information, stock market data, game state packets, file transfers, or private data channels.
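As a hedged sketch of how these objects fit together on the calling side, the function below creates a connection, opens a data channel, and produces an offer. The names `startCall` and `sendSignal` are assumptions for illustration; the constructor is passed in as a parameter so the function is not tied to a browser global, and in a browser you would pass `RTCPeerConnection`.

```javascript
// Caller side of a WebRTC connection with a data channel.
// `PeerConnection` is injected; in a browser pass the global RTCPeerConnection.
// `sendSignal` is a hypothetical callback that delivers the offer to the
// remote node through whatever signaling channel the application uses.
async function startCall(PeerConnection, sendSignal) {
  const pc = new PeerConnection();

  // Data channels carry arbitrary application data between the two nodes.
  const channel = pc.createDataChannel("chat");
  channel.onopen = () => channel.send("hello");

  // Describe the local end of the session and hand it to the remote node.
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  sendSignal({ type: offer.type, sdp: offer.sdp });

  return { pc, channel };
}
```

The remote side would answer with its own session description, after which the channel's open event fires. That half, and the exchange of ICE candidates, is omitted here to keep the sketch short.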


WebRTC interfaces

Because WebRTC provides interfaces that work together to perform various tasks, we divided them into categories. See the sidebar alphabetical index for quick navigation.

Connection setup and management

These interfaces are used to set up, open and manage WebRTC connections. They represent peer-to-peer media connections, data channels, and interfaces used to exchange information about the capabilities of each node in order to select the best configuration when establishing a two-way multimedia connection.

RTCPeerConnection - Represents a WebRTC connection between the local computer and a remote node. Used to handle data transfer between the two nodes.
RTCSessionDescription - Represents the parameters of a session. Each RTCSessionDescription contains a type, indicating which part of the offer/answer negotiation process it describes, and the SDP descriptor of the session.
RTCIceCandidate - Represents an Interactive Connectivity Establishment (ICE) candidate for establishing an RTCPeerConnection.
RTCIceTransport - Represents information about an Interactive Connectivity Establishment (ICE) transport.
RTCPeerConnectionIceEvent - Represents events that occur in relation to ICE candidates, typically on an RTCPeerConnection. Only one event is of this type: icecandidate.
RTCRtpSender - Controls the encoding and transmission of data for a MediaStreamTrack on an RTCPeerConnection.
RTCRtpReceiver - Controls the reception and decoding of data for a MediaStreamTrack on an RTCPeerConnection.
RTCTrackEvent - Indicates that a new incoming MediaStreamTrack has been created and an associated RTCRtpReceiver has been added to the RTCPeerConnection object.
RTCCertificate - Represents the certificate that an RTCPeerConnection object uses.
RTCDataChannel - Represents a bidirectional data channel between the two nodes of a connection.
RTCDataChannelEvent - Represents events that occur when an RTCDataChannel is attached to an RTCPeerConnection. Only one event is of this type: datachannel.
RTCDTMFSender - Controls the encoding and transmission of dual-tone multi-frequency (DTMF) signaling for an RTCPeerConnection.
RTCDTMFToneChangeEvent - Indicates an incoming DTMF tone change event. This event does not bubble and is not cancelable (unless otherwise specified).
RTCStatsReport - Asynchronously reports statistics for the passed MediaStreamTrack object.

RTCIdentityProviderRegistrar - Registers an identity provider (IdP).
RTCIdentityProvider - Enables the browser to request the generation or validation of an identity assertion.
RTCIdentityAssertion - Represents the identity of the remote node of the current connection. If no node has yet been set and verified, this interface returns null. Once set, it does not change.
RTCIdentityEvent - Represents an identity assertion generated by an identity provider (IdP). It is an event of an RTCPeerConnection object. Only one event is of this type: identityresult.
RTCIdentityErrorEvent - Represents an error associated with the identity provider (IdP). It is an event of an RTCPeerConnection object. Two error events are of this type: idpassertionerror and idpvalidationerror.

Guides

Overview of the WebRTC architecture - The network protocols and connection standards behind WebRTC. This overview showcases these standards.
WebRTC allows you to set up a node-to-node connection to transfer arbitrary data, audio, video streams, or any combination of them in the browser. In this article, we'll take a look at the life of a WebRTC session, from establishing a connection all the way to closing it when it is no longer needed.
WebRTC Overview - The WebRTC API consists of several interrelated APIs and protocols that work together to support the exchange of data and media streams between two or more nodes. This article provides a brief overview of each of these APIs and its purpose.
WebRTC Basics - This article walks you through building a cross-browser RTC application. By the end of it, you should have a working point-to-point data and media channel.
WebRTC Protocols - This article introduces the protocols on top of which the WebRTC API is built.
This guide describes how you can use a node-to-node connection and a linked

WebRTC allows you to implement real-time audio / video communication through a browser

In this topic, I will tell you how to implement the simplest WebRTC application.

1. getUserMedia - getting access to media devices (microphone / webcam)

Nothing complicated: with 10 lines of JavaScript code you can see and hear yourself in the browser (demo).

Create index.html :
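The original listing is not reproduced in this copy. As a hedged sketch, the script part of such a page might look like the following. It uses the promise-based `navigator.mediaDevices.getUserMedia`, which may differ from the API the original listing used, and the function name `startPreview` and the element id in the usage comment are assumptions.

```javascript
// Ask for camera and microphone and show the result in a <video> element.
// `mediaDevices` defaults to the browser's navigator.mediaDevices; it is a
// parameter only so the function can be exercised outside a browser.
async function startPreview(videoElement, mediaDevices = navigator.mediaDevices) {
  const stream = await mediaDevices.getUserMedia({ audio: true, video: true });
  videoElement.srcObject = stream; // attach the live stream to the element
  await videoElement.play();
  return stream;
}

// In a page that contains <video id="preview" autoplay></video>:
// startPreview(document.getElementById("preview")).catch(console.error);
```

If the user denies the permission prompt, the promise rejects, so the usage line attaches a catch handler.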

You can apply CSS3 filters to the video element.

The sad thing here is that at this stage of WebRTC development, I cannot tell the browser “I trust this site, always give it access to my camera and microphone” and you need to click Allow after each page opening / refresh.

Well, it would not be superfluous to recall that if you gave access to the camera in one browser, the other will receive PERMISSION_DENIED when trying to access it.

2. Signaling server (signal server)

Here I depart from the order of most "WebRTC getting started" guides, because as a second step they demonstrate WebRTC's capabilities on a single client, which for me personally only made the explanation more confusing.

The signaling server is the WebRTC coordinating center that provides communication between clients, connection initialization and closing, and error reporting.

The signaling server in our case is Node.js + socket.io + node-static; it will listen on port 1234.
In addition, node-static can serve index.html, which keeps our application as simple as possible.

In the application folder, install the necessary packages:

npm install socket.io
npm install node-static

WebRTC (Web Real-Time Communications) is a standard that describes the real-time transfer of streaming audio, video, and content from browser to browser without installing plug-ins or other extensions. The standard turns the browser into a video conferencing terminal: just open a web page to start communicating.

What is WebRTC?

In this article, we will cover everything a regular user needs to know about WebRTC technology. We will consider the advantages and disadvantages of the project, reveal some secrets, and explain how it works and where and for what WebRTC is used.

What you need to know about WebRTC?

The evolution of video standards and technologies

Sergey Yutsaitis, Cisco, Video+Conference 2016

How WebRTC works

On the client side

The user opens a page containing an HTML5 <video> tag.
The browser requests access to the user's webcam and microphone.
The JavaScript code on the user's page controls the connection parameters (IP addresses and ports of the WebRTC server or of other WebRTC clients) to traverse NAT and firewalls.
When it receives information about the other party, or about the stream with the conference mixed on the server, the browser starts negotiating the audio and video codecs to use.
The process of encoding and streaming data between WebRTC clients (in our case, between the browser and the server) begins.

On the WebRTC server side

A video server is not required for data exchange between two participants, but if you want to combine several participants in one conference, a server is required.

The video server will receive media traffic from various sources, convert it and send it to users who use WebRTC as a terminal.

The WebRTC server will also receive media traffic from WebRTC peers and pass it on to conference participants using desktop or mobile applications, if any.

Benefits of the standard

No software installation required.
Very high communication quality thanks to:
- Use of modern video (VP8, H.264) and audio codecs (Opus).
- Automatic adjustment of stream quality to connection conditions.
- Built-in echo and noise cancellation.
- Automatic level control of participants' microphones (AGC).
High level of security: all connections are secured and encrypted with the TLS and SRTP protocols.
There is a built-in mechanism for capturing content, such as the desktop.
Ability to implement any control interface based on HTML5 and JavaScript.
The ability to integrate the interface with any back-end systems using WebSockets.
An open source project - you can embed it in your product or service.
True cross-platform: the same WebRTC application will work equally well on any operating system, desktop or mobile, provided that the browser supports WebRTC. This saves a lot of resources for software development.

Disadvantages of the standard

To organize group audio and video conferences, a videoconferencing server is required that would mix video and audio from participants, because the browser does not know how to synchronize multiple incoming streams with each other.
All WebRTC solutions are incompatible with each other, because the standard describes only methods for transmitting video and sound, leaving the implementation of methods for addressing subscribers, tracking their availability, exchanging messages and files, scheduling, and other things for the vendor.
In other words, you will not be able to call from a WebRTC application of one developer to a WebRTC application of another developer.
Group conference mixing requires a lot of computing resources, so this type of video communication requires the purchase of a paid subscription or investment in its infrastructure, where each conference requires 1 physical core of a modern processor.

WebRTC Secrets: How Vendors Benefit From Disruptive Web Technology

Tzachi Levent-Levi, Bloggeek.me, Video+Conference 2015

WebRTC for the video conferencing market

Increase in the number of videoconferencing terminals

WebRTC technology has had a strong influence on the development of the video conferencing market. After the release of the first browsers with WebRTC support in 2013, the potential number of video conferencing terminals around the world immediately increased by 1 billion devices. In fact, each browser has become a videoconferencing terminal that is not inferior to its hardware counterparts in terms of communication quality.

Use in specialized solutions

Using various JavaScript libraries and cloud service APIs with WebRTC support makes it easy to add video support to any web project. In the past, real-time data transmission required developers to learn how the protocols work and to rely on other companies' work, which most often required additional licensing and increased costs. WebRTC is already actively used in services such as "Call from the site", "Online support chat", etc.

Ex-users of Skype for Linux

In 2014, Microsoft announced the end of support for the Skype for Linux project, which caused great annoyance among IT professionals. WebRTC technology is not tied to the operating system, but is implemented at the browser level, i.e. Linux users will be able to see WebRTC-based products and services as a full-fledged replacement for Skype.

Competition with Flash

WebRTC and HTML5 dealt a death blow to Flash technology, which was already past its best years. Since 2017, the leading browsers have officially stopped supporting Flash, and the technology has finally disappeared from the market. But Flash deserves credit, because it created the web conferencing market and provided the technical capabilities for live communication in browsers.

WebRTC video presentations

Dmitry Odintsov, TrueConf, Video+Conference October 2017

Codecs in WebRTC

Audio codecs

To compress audio traffic in WebRTC, Opus and G.711 codecs are used.

G.711 is the oldest voice codec, with a high bitrate (64 kbps), and is most often used in traditional telephony systems. Its main advantage is minimal computational load thanks to lightweight compression algorithms. The codec compresses voice signals only slightly and introduces no additional audio delay during communication between users.
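The 64 kbps figure follows directly from the codec's sampling parameters: G.711 takes 8000 samples per second at 8 bits per sample. The small sketch below works through that arithmetic. The helper names are hypothetical, and the 20 ms packet interval is an assumption (a common choice), not something stated in the text.

```javascript
// G.711 samples narrowband audio 8000 times per second, 8 bits per sample.
const G711_SAMPLE_RATE_HZ = 8000;
const G711_BITS_PER_SAMPLE = 8;

// Bitrate in kilobits per second: samples per second times bits per sample.
function g711BitrateKbps() {
  return (G711_SAMPLE_RATE_HZ * G711_BITS_PER_SAMPLE) / 1000;
}

// Audio payload size of one packet for a given packetization interval.
// The 20 ms default is an assumption used for illustration.
function g711PayloadBytes(packetMs = 20) {
  const samples = (G711_SAMPLE_RATE_HZ * packetMs) / 1000;
  return (samples * G711_BITS_PER_SAMPLE) / 8;
}

const g711Kbps = g711BitrateKbps();      // 64
const g711Bytes20ms = g711PayloadBytes(); // 160
```

With one byte per sample there is almost nothing to compute per packet, which is where the low processing load comes from.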

G.711 is supported by a large number of devices. Systems that use this codec are easier to use than those based on other audio codecs (G.723, G.726, G.728, etc.). In terms of quality, G.711 scored 4.2 in MOS testing (a score of 4-5 is the highest and means good quality, similar to or better than the quality of voice traffic in ISDN).

Opus is a codec with low encoding delay (from 2.5 ms to 60 ms), variable bitrate support, and a high level of compression, which makes it ideal for audio streaming over networks with variable bandwidth. Opus is a hybrid solution that combines the best characteristics of the SILK codec (voice compression, elimination of distortions in human speech) and the CELT codec (audio data coding). The codec is freely available, and developers who use it do not need to pay royalties to copyright holders. Compared to other audio codecs, Opus wins in many respects. It has eclipsed such popular low-bitrate codecs as MP3, Vorbis, and AAC LC. Opus restores the "picture" of sound closer to the original than AMR-WB and Speex. This codec is the future, which is why the creators of WebRTC included it in the mandatory set of supported audio standards.
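To give the delay range a concrete meaning, the sketch below converts a frame duration into the number of audio samples per channel that one frame covers. The helper name is hypothetical, and the 48 kHz default sample rate is an assumption (Opus's full-band rate), not something stated in the text.

```javascript
// Number of samples per channel in one frame of the given duration.
// The 48000 Hz default is an assumption used for illustration.
function samplesPerFrame(frameMs, sampleRateHz = 48000) {
  return (sampleRateHz * frameMs) / 1000;
}

const shortestFrame = samplesPerFrame(2.5); // 120 samples
const longestFrame = samplesPerFrame(60);   // 2880 samples
```

Shorter frames mean the encoder waits for less audio before it can send anything, which lowers delay at the cost of more packets per second.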

Video codecs

The issues of choosing a video codec for WebRTC took the developers several years, in the end they decided to use H.264 and VP8. Almost all modern browsers support both codecs. Video conferencing servers need only support one to work with WebRTC.

VP8 is a free video codec with an open license, featuring high video stream decoding speed and increased resistance to frame loss. The codec is universal, it is easy to implement it into hardware platforms, so developers of video conferencing systems often use it in their products.

The paid H.264 video codec became known much earlier than its counterpart. It compresses the video stream heavily while maintaining high video quality. Its wide adoption in hardware video conferencing systems is the argument for using it in the WebRTC standard.

Google and Mozilla actively promote the VP8 codec, while Microsoft, Apple and Cisco promote H.264 (to ensure compatibility with traditional video conferencing systems). This creates a very big problem for developers of cloud WebRTC solutions. If all participants in a conference use the same browser, it is enough to mix the conference once with one codec. If the browsers differ and Safari or Edge is among them, the conference has to be encoded twice with different codecs, which doubles the system requirements of the media server and, as a result, raises the cost of subscriptions to WebRTC services.
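The cost argument above can be made concrete with a deliberately simplified model: the media server encodes the mixed conference once per distinct codec among the participants. The browser-to-codec table below is an assumption that mirrors the paragraph above; in practice browsers negotiate codecs through SDP, and many of them support both.

```typescript
type VideoCodec = "VP8" | "H264";

// Assumed preference per browser, taken from the paragraph above (not a spec).
const ASSUMED_CODEC = new Map<string, VideoCodec>([
  ["chrome", "VP8"],
  ["firefox", "VP8"],
  ["safari", "H264"],
  ["edge", "H264"],
]);

/** Number of separate encodes of the mixed conference the server must run. */
function encodingsNeeded(browsers: string[]): number {
  const codecs = new Set<VideoCodec>();
  for (const browser of browsers) {
    const codec = ASSUMED_CODEC.get(browser.toLowerCase());
    if (codec === undefined) {
      throw new Error(`unknown browser: ${browser}`);
    }
    codecs.add(codec);
  }
  return codecs.size;
}

console.log(encodingsNeeded(["Chrome", "Firefox"]));        // 1
console.log(encodingsNeeded(["Chrome", "Safari", "Edge"])); // 2
```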

WebRTC API

WebRTC technology is based on three main APIs:

MediaStream (responsible for letting the web browser receive audio and video signals from cameras or the user's desktop).
RTCPeerConnection (responsible for the connection between browsers that “exchanges” media data received from the camera, microphone and desktop. The “duties” of this API also include signal processing (removing extraneous noise, adjusting the microphone volume) and control over which audio and video codecs are used).
RTCDataChannel (provides two-way data transfer over an established connection).
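A page can check at run time whether the three APIs listed above are present before trying to use them. The sketch below is one possible way to do that. `detectWebRtc` and `WebRtcSupport` are hypothetical names; the properties it probes (`navigator.mediaDevices.getUserMedia`, `RTCPeerConnection` and its `createDataChannel` method) are the standard browser entry points. The scope object is passed in as a parameter so the function also runs outside a browser.

```typescript
interface WebRtcSupport {
  getUserMedia: boolean;   // media capture
  peerConnection: boolean; // RTCPeerConnection
  dataChannel: boolean;    // RTCDataChannel, created via createDataChannel()
}

/** Probe a global-like object (for example `window`) for the WebRTC APIs. */
function detectWebRtc(scope: any): WebRtcSupport {
  const pc = scope.RTCPeerConnection;
  return {
    getUserMedia:
      typeof scope.navigator?.mediaDevices?.getUserMedia === "function",
    peerConnection: typeof pc === "function",
    dataChannel:
      typeof pc === "function" &&
      typeof pc.prototype?.createDataChannel === "function",
  };
}

// In a browser this reports what the current page can use.
console.log(detectWebRtc(globalThis));
```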

Before accessing the user's microphone and camera, the browser asks for permission. In Google Chrome, access can be pre-configured in the "Settings" section; in Opera and Firefox, devices are chosen from a drop-down list at the moment of access. The permission prompt appears every time when the page is served over HTTP, and only once when it is served over HTTPS.
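Since the user may refuse the prompt, code that requests the camera and microphone has to handle a rejection. The following is a minimal sketch of that pattern. `requestCapture`, `MediaDevicesLike` and `CaptureResult` are illustrative names; in a browser the argument would be `navigator.mediaDevices`, whose `getUserMedia` rejects with an error named `NotAllowedError` when permission is denied.

```typescript
// Minimal shape of navigator.mediaDevices, declared here so the sketch
// does not depend on browser type definitions.
interface MediaDevicesLike {
  getUserMedia(constraints: { audio: boolean; video: boolean }): Promise<unknown>;
}

type CaptureResult =
  | { granted: true; stream: unknown }
  | { granted: false; reason: string };

/** Ask for microphone and camera; report a refusal instead of throwing. */
async function requestCapture(devices: MediaDevicesLike): Promise<CaptureResult> {
  try {
    const stream = await devices.getUserMedia({ audio: true, video: true });
    return { granted: true, stream };
  } catch (err) {
    // Browsers reject with an error object whose `name` identifies the cause.
    const name = (err as { name?: unknown } | null)?.name;
    return {
      granted: false,
      reason: typeof name === "string" ? name : "UnknownError",
    };
  }
}
```

In a page this would be called as `requestCapture(navigator.mediaDevices)`, and a result with `granted: false` would typically lead to a message asking the user to allow access.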

RTCPeerConnection. Each browser participating in a WebRTC conference must have access to this object. Thanks to RTCPeerConnection, media data can pass from one browser to another even through NAT and firewalls. To transmit media streams successfully, the participants must exchange the following data over a transport such as WebSocket:

the initiating participant sends the second participant an Offer-SDP (a data structure describing the media stream it will transmit);
the second participant generates a “response”, an Answer-SDP, and sends it to the initiator;
the participants then exchange ICE candidates, if any are found (this matters when the participants are behind NAT or firewalls).

Once this exchange completes successfully, the media streams (audio and video) are transferred directly between the participants.
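The offer, answer and ICE candidate steps above can be sketched as two small functions. This is an illustration under stated assumptions, not a complete client: the `Signal` message format, `PeerLike`, `startCall` and `handleSignal` are hypothetical names, and the `send` callback stands in for whatever signaling transport is used (for example a WebSocket). The methods on `PeerLike` are modeled on the browser's RTCPeerConnection.

```typescript
type Description = { type: string; sdp?: string };

// Hypothetical wire format for the signaling channel.
type Signal =
  | { kind: "offer"; description: Description }
  | { kind: "answer"; description: Description }
  | { kind: "candidate"; candidate: unknown };

// The subset of RTCPeerConnection methods this sketch relies on.
interface PeerLike {
  createOffer(): Promise<Description>;
  createAnswer(): Promise<Description>;
  setLocalDescription(d: Description): Promise<void>;
  setRemoteDescription(d: Description): Promise<void>;
  addIceCandidate(c: unknown): Promise<void>;
}

/** Initiator: create the Offer-SDP and send it to the other participant. */
async function startCall(pc: PeerLike, send: (s: Signal) => void): Promise<void> {
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  send({ kind: "offer", description: offer });
}

/** Either side: react to a message received from the signaling transport. */
async function handleSignal(
  pc: PeerLike,
  msg: Signal,
  send: (s: Signal) => void,
): Promise<void> {
  switch (msg.kind) {
    case "offer": {
      // Second participant: accept the offer and reply with an Answer-SDP.
      await pc.setRemoteDescription(msg.description);
      const answer = await pc.createAnswer();
      await pc.setLocalDescription(answer);
      send({ kind: "answer", description: answer });
      break;
    }
    case "answer":
      // Initiator: the answer completes the SDP exchange.
      await pc.setRemoteDescription(msg.description);
      break;
    case "candidate":
      // Both sides: add each ICE candidate the other side discovered.
      await pc.addIceCandidate(msg.candidate);
      break;
  }
}
```

In a browser, each side would also forward the candidates reported through the connection's `onicecandidate` handler as `candidate` messages, so that `handleSignal` on the other side can add them.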

RTCDataChannel. Support for data channels appeared in browsers relatively recently, so this API can only be relied on when WebRTC is used in Mozilla Firefox 22+ and Google Chrome 26+. With it, participants can exchange text messages in the browser.
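A text chat over a data channel needs little more than a `send` call and an `onmessage` handler. The sketch below wires those two together. `attachChat` and `ChannelLike` are illustrative names; the `readyState`, `send` and `onmessage` members follow the browser's RTCDataChannel, which a page would obtain from `createDataChannel` on its peer connection.

```typescript
// The subset of RTCDataChannel members this sketch relies on.
interface ChannelLike {
  readyState: string; // "connecting" | "open" | "closing" | "closed"
  send(data: string): void;
  onmessage: ((event: { data: unknown }) => void) | null;
}

/**
 * Deliver incoming text messages to `onText` and return a function that
 * sends outgoing text. The returned function reports false when the
 * channel is not open yet (or no longer open) and nothing was sent.
 */
function attachChat(
  channel: ChannelLike,
  onText: (text: string) => void,
): (text: string) => boolean {
  channel.onmessage = (event) => {
    // Data channels can also carry binary data; this chat ignores it.
    if (typeof event.data === "string") {
      onText(event.data);
    }
  };
  return (text) => {
    if (channel.readyState !== "open") {
      return false;
    }
    channel.send(text);
    return true;
  };
}
```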

WebRTC connection

Supported desktop browsers

Google Chrome (17+) and all browsers based on the Chromium engine;
Mozilla Firefox (18+);
Opera (12+);
Safari (11+).

Supported mobile browsers

Google Chrome (28+);
Mozilla Firefox (24+);
Opera Mobile (12+);
Safari (11+, on iOS).

WebRTC, Microsoft and Internet Explorer

For a very long time, Microsoft kept silent about WebRTC support in Internet Explorer and in its new Edge browser. The guys from Redmond don't really like putting technology they don't control into users' hands; that's their policy. But gradually things got moving, because WebRTC could no longer be ignored, and the ORTC project, derived from the WebRTC standard, was announced.

According to the developers, ORTC is an extension of the WebRTC standard with an improved set of APIs based on JavaScript and HTML5. Translated into plain language, this means that everything will stay the same, except that Microsoft, not Google, will control the standard and its development. The set of codecs has been expanded with support for H.264 and some G.7XX-series audio codecs used in telephony and hardware video conferencing systems. There may also be built-in support for RDP (for content sharing) and messaging. By the way, Internet Explorer users are out of luck: ORTC support will only be available in Edge. And, of course, such a set of protocols and codecs fits Skype for Business with little effort, which opens up even more business applications for WebRTC.