
webrtc && aiortc

WebRTC_API

https://developer.mozilla.org/en-US/docs/Web/API/WebRTC_API

WebRTC (Web Real-Time Communication) is a technology that enables Web applications and sites to capture and optionally stream audio and/or video media, as well as to exchange arbitrary data between browsers without requiring an intermediary. The set of standards that comprise WebRTC makes it possible to share data and perform teleconferencing peer-to-peer, without requiring that the user install plug-ins or any other third-party software.

WebRTC consists of several interrelated APIs and protocols which work together to achieve this. The documentation you'll find here will help you understand the fundamentals of WebRTC, how to set up and use both data and media connections, and more.

WebRTC concepts and usage

WebRTC serves multiple purposes; together with the Media Capture and Streams API, they provide powerful multimedia capabilities to the Web, including support for audio and video conferencing, file exchange, screen sharing, identity management, and interfacing with legacy telephone systems including support for sending DTMF (touch-tone dialing) signals. Connections between peers can be made without requiring any special drivers or plug-ins, and can often be made without any intermediary servers.

Connections between two peers are represented by the RTCPeerConnection interface. Once a connection has been established and opened using RTCPeerConnection, media streams (MediaStreams) and/or data channels (RTCDataChannels) can be added to the connection.
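
Since this post's examples are in Python, here is a minimal sketch of that offer/answer handshake using aiortc (introduced later in this post); the in-process exchange between two peers stands in for a real signaling channel:

import asyncio
from aiortc import RTCPeerConnection

async def main():
    pc1, pc2 = RTCPeerConnection(), RTCPeerConnection()
    pc1.createDataChannel("demo")  # give the offer something to negotiate

    # offer/answer exchange; in a real app the descriptions travel
    # over a signaling channel (HTTP, WebSocket, ...)
    await pc1.setLocalDescription(await pc1.createOffer())
    await pc2.setRemoteDescription(pc1.localDescription)
    await pc2.setLocalDescription(await pc2.createAnswer())
    await pc1.setRemoteDescription(pc2.localDescription)

    print(pc1.connectionState, pc2.connectionState)
    await pc1.close()
    await pc2.close()

asyncio.run(main())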

Media streams can consist of any number of tracks of media information; tracks, which are represented by objects based on the MediaStreamTrack interface, may contain one of a number of types of media data, including audio, video, and text (such as subtitles or even chapter names). Most streams consist of at least one audio track and likely also a video track, and can be used to send and receive both live media or stored media information (such as a streamed movie).
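
As a small illustration of tracks as objects, a sketch using aiortc's media helpers (the file names are hypothetical): MediaPlayer exposes the audio and video tracks of a file or device, and MediaRecorder consumes tracks.

import asyncio
from aiortc.contrib.media import MediaPlayer, MediaRecorder

async def main():
    player = MediaPlayer("input.mp4")     # hypothetical input file
    recorder = MediaRecorder("copy.mp4")  # hypothetical output file
    if player.audio:
        recorder.addTrack(player.audio)   # an audio MediaStreamTrack
    if player.video:
        recorder.addTrack(player.video)   # a video MediaStreamTrack
    await recorder.start()
    await asyncio.sleep(3)  # copy a few seconds of media
    await recorder.stop()

asyncio.run(main())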

You can also use the connection between two peers to exchange arbitrary binary data using the RTCDataChannel interface. This can be used for back-channel information, metadata exchange, game status packets, file transfers, or even as a primary channel for data transfer.
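
A sketch of such a data channel exchange between two in-process aiortc peers (the channel name and payload are illustrative):

import asyncio
from aiortc import RTCPeerConnection

async def main():
    pc1, pc2 = RTCPeerConnection(), RTCPeerConnection()
    done = asyncio.Event()

    channel = pc1.createDataChannel("chat")

    @channel.on("open")
    def on_open():
        channel.send("ping")  # strings or bytes both work

    @pc2.on("datachannel")
    def on_datachannel(ch):
        @ch.on("message")
        def on_message(message):
            print("pc2 received:", message)
            done.set()

    # same offer/answer dance as in the sketch above
    await pc1.setLocalDescription(await pc1.createOffer())
    await pc2.setRemoteDescription(pc1.localDescription)
    await pc2.setLocalDescription(await pc2.createAnswer())
    await pc1.setRemoteDescription(pc2.localDescription)

    await done.wait()
    await pc1.close()
    await pc2.close()

asyncio.run(main())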

Interoperability

WebRTC is in general well supported in modern browsers, but some incompatibilities remain. The adapter.js library is a shim to insulate apps from these incompatibilities.

https://www.apizee.com/what-is-webrtc.php

What is WebRTC?

WebRTC is an open-source project that enables real-time communication and data transfer across browsers and devices. It makes real-time voice, text, and—importantly—video communication possible.

A key feature of WebRTC is that it allows browsers and clients to communicate directly, without sending information through an intermediary server. It effectively cuts out the middle step so that data stays between the users.

This is different from a bidirectional communication protocol like WebSockets, where real-time communication goes through the server before it reaches the end client.

[Figure: WebSockets vs WebRTC]

The World Wide Web Consortium (W3C) developed browser-based WebRTC to allow secure, direct communication between browsers and devices. Most popular browsers (e.g., Google Chrome, Apple Safari, Mozilla Firefox, and Microsoft Edge) support this technology.

https://telecom.altanai.com/category/web-realtimecomm-webrtc/webrtc-standards/

https://www.researchgate.net/figure/WebRTC-Datachannel-Establish-Flows_fig6_308671323

P2P EXAMPLE

https://webrtc.github.io/samples/src/content/peerconnection/pc1/

OTHER

https://webrtc.github.io/samples/

aiortc

https://github.com/aiortc/aiortc

What is aiortc?

aiortc is a library for Web Real-Time Communication (WebRTC) and Object Real-Time Communication (ORTC) in Python. It is built on top of asyncio, Python's standard asynchronous I/O framework.

The API closely follows its JavaScript counterpart while using Pythonic constructs (see the sketch after this list):

  • promises are replaced by coroutines
  • events are emitted using pyee.EventEmitter
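
Both constructs in one small, runnable sketch (assuming aiortc is installed):

import asyncio
from aiortc import RTCPeerConnection

async def main():
    pc = RTCPeerConnection()
    pc.createDataChannel("demo")  # so createOffer() has something to offer

    # JavaScript would use addEventListener / pc.oniceconnectionstatechange;
    # aiortc emits the same events through pyee's EventEmitter API.
    @pc.on("iceconnectionstatechange")
    def on_ice_state():
        print("ICE state:", pc.iceConnectionState)

    # JavaScript's pc.createOffer() returns a Promise; here it is a
    # coroutine that we simply await.
    await pc.setLocalDescription(await pc.createOffer())
    print("created a", pc.localDescription.type)
    await pc.close()

asyncio.run(main())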

To learn more about aiortc, please read the documentation.

Why should I use aiortc?

The main WebRTC and ORTC implementations are either built into web browsers or come in the form of native code. While they are extensively battle-tested, their internals are complex, and they do not provide Python bindings. Furthermore, they are tightly coupled to a media stack, making it hard to plug in audio or video processing algorithms.

In contrast, the aiortc implementation is fairly simple and readable. As such, it is a good starting point for programmers wishing to understand how WebRTC works or to tinker with its internals. It is also easy to create innovative products by leveraging the extensive modules available in the Python ecosystem. For instance, you can build a full server handling both signaling and data channels, or apply computer vision algorithms to video frames using OpenCV (the FastAPI demo reproduced later in this post does exactly this).

Furthermore, a lot of effort has gone into writing an extensive test suite for the aiortc code to ensure best-in-class code quality.

CHINESE TUTORIAL

https://zhuanlan.zhihu.com/p/387772163

DEMO (integrated with aiohttp by default)

https://github.com/aiortc/aiortc/tree/main/examples/server

REACT VERSION

https://github.com/silverbulletmdc/py_webrtc_react_video_demo

FASTAPI INTEGRATION

Provides a docker compose deployment.

https://github.com/zoetaka38/fastapi-aiortc/tree/main

FACE, EYE, AND SMILE DETECTION

https://github.com/DJWOMS/webrtc_opencv_fastapi/tree/main

import asyncio
import os
import platform
import cv2

from av import VideoFrame

from aiortc import MediaStreamTrack, RTCPeerConnection, RTCSessionDescription
from aiortc.contrib.media import MediaPlayer, MediaRelay, MediaBlackhole

from fastapi import FastAPI
from fastapi.staticfiles import StaticFiles

from starlette.requests import Request
from starlette.responses import HTMLResponse
from starlette.templating import Jinja2Templates

from src.schemas import Offer

ROOT = os.path.dirname(__file__)

app = FastAPI()
app.mount("/static", StaticFiles(directory="static"), name="static")
templates = Jinja2Templates(directory="templates")

# Haar cascade classifiers bundled with OpenCV
faces = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eyes = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_eye.xml")
smiles = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_smile.xml")


class VideoTransformTrack(MediaStreamTrack):
    """
    A video stream track that transforms frames from an another track.
    """

    kind = "video"

    def __init__(self, track, transform):
        super().__init__()
        self.track = track
        self.transform = transform

    async def recv(self):
        frame = await self.track.recv()

        if self.transform == "cartoon":
            img = frame.to_ndarray(format="bgr24")

            # prepare color
            img_color = cv2.pyrDown(cv2.pyrDown(img))
            for _ in range(6):
                img_color = cv2.bilateralFilter(img_color, 9, 9, 7)
            img_color = cv2.pyrUp(cv2.pyrUp(img_color))

            # prepare edges
            img_edges = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
            img_edges = cv2.adaptiveThreshold(
                cv2.medianBlur(img_edges, 7),
                255,
                cv2.ADAPTIVE_THRESH_MEAN_C,
                cv2.THRESH_BINARY,
                9,
                2,
            )
            img_edges = cv2.cvtColor(img_edges, cv2.COLOR_GRAY2RGB)

            # combine color and edges
            img = cv2.bitwise_and(img_color, img_edges)

            # rebuild a VideoFrame, preserving timing information
            new_frame = VideoFrame.from_ndarray(img, format="bgr24")
            new_frame.pts = frame.pts
            new_frame.time_base = frame.time_base
            return new_frame
        elif self.transform == "edges":
            # perform edge detection
            img = frame.to_ndarray(format="bgr24")
            img = cv2.cvtColor(cv2.Canny(img, 100, 200), cv2.COLOR_GRAY2BGR)

            # rebuild a VideoFrame, preserving timing information
            new_frame = VideoFrame.from_ndarray(img, format="bgr24")
            new_frame.pts = frame.pts
            new_frame.time_base = frame.time_base
            return new_frame
        elif self.transform == "rotate":
            # rotate image
            img = frame.to_ndarray(format="bgr24")
            rows, cols, _ = img.shape
            M = cv2.getRotationMatrix2D((cols / 2, rows / 2), frame.time * 45, 1)
            img = cv2.warpAffine(img, M, (cols, rows))

            # rebuild a VideoFrame, preserving timing information
            new_frame = VideoFrame.from_ndarray(img, format="bgr24")
            new_frame.pts = frame.pts
            new_frame.time_base = frame.time_base
            return new_frame
        elif self.transform == "cv":
            img = frame.to_ndarray(format="bgr24")
            face = faces.detectMultiScale(img, 1.1, 19)
            for (x, y, w, h) in face:
                cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)

            eye = eyes.detectMultiScale(img, 1.1, 19)
            for (x, y, w, h) in eye:
                cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)

            # smile = smiles.detectMultiScale(img, 1.1, 19)
            # for (x, y, w, h) in smile:
            #     cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 5), 2)

            new_frame = VideoFrame.from_ndarray(img, format="bgr24")
            new_frame.pts = frame.pts
            new_frame.time_base = frame.time_base
            return new_frame
        else:
            return frame


def create_local_tracks(play_from=None):
    if play_from:
        player = MediaPlayer(play_from)
        return player.audio, player.video
    else:
        options = {"framerate": "30", "video_size": "1920x1080"}
        if platform.system() == "Darwin":
            webcam = MediaPlayer(
                "default:none", format="avfoundation", options=options
            )
        elif platform.system() == "Windows":
            # NOTE: the dshow device name is machine-specific; list yours with
            # `ffmpeg -list_devices true -f dshow -i dummy`
            webcam = MediaPlayer(
                "video=FULL HD 1080P Webcam", format="dshow", options=options
            )
        else:
            webcam = MediaPlayer("/dev/video0", format="v4l2", options=options)
        relay = MediaRelay()
        return None, relay.subscribe(webcam.video)


@app.get("/", response_class=HTMLResponse)
async def index(request: Request):
    return templates.TemplateResponse("index.html", {"request": request})


@app.get("/cv", response_class=HTMLResponse)
async def index(request: Request):
    return templates.TemplateResponse("index_cv.html", {"request": request})


@app.post("/offer")
async def offer(params: Offer):
    offer = RTCSessionDescription(sdp=params.sdp, type=params.type)

    pc = RTCPeerConnection()
    pcs.add(pc)
    recorder = MediaBlackhole()

    @pc.on("connectionstatechange")
    async def on_connectionstatechange():
        print("Connection state is %s" % pc.connectionState)
        if pc.connectionState == "failed":
            await pc.close()
            pcs.discard(pc)

    # open media source
    audio, video = create_local_tracks()

    # handle offer
    await pc.setRemoteDescription(offer)
    await recorder.start()

    # attach local tracks to the transceivers negotiated by the offer
    for t in pc.getTransceivers():
        if t.kind == "audio" and audio:
            pc.addTrack(audio)
        elif t.kind == "video" and video:
            pc.addTrack(video)

    # send answer
    answer = await pc.createAnswer()
    await pc.setLocalDescription(answer)

    return {"sdp": pc.localDescription.sdp, "type": pc.localDescription.type}


@app.post("/offer_cv")
async def offer(params: Offer):
    offer = RTCSessionDescription(sdp=params.sdp, type=params.type)

    pc = RTCPeerConnection()
    pcs.add(pc)
    recorder = MediaBlackhole()

    relay = MediaRelay()

    @pc.on("connectionstatechange")
    async def on_connectionstatechange():
        print("Connection state is %s" % pc.connectionState)
        if pc.connectionState == "failed":
            await pc.close()
            pcs.discard(pc)

    # open media source
    # audio, video = create_local_tracks()

    @pc.on("track")
    def on_track(track):

        # if track.kind == "audio":
        #     pc.addTrack(player.audio)
        #     recorder.addTrack(track)
        if track.kind == "video":
            pc.addTrack(
                VideoTransformTrack(relay.subscribe(track), transform=params.video_transform)
            )
            # if args.record_to:
            #     recorder.addTrack(relay.subscribe(track))

        @track.on("ended")
        async def on_ended():
            await recorder.stop()

    # handle offer
    await pc.setRemoteDescription(offer)
    await recorder.start()

    # send answer
    answer = await pc.createAnswer()
    await pc.setLocalDescription(answer)

    return {"sdp": pc.localDescription.sdp, "type": pc.localDescription.type}


# module-level state: the set of active peer connections
pcs = set()


@app.on_event("shutdown")
async def on_shutdown():
    # close peer connections
    coros = [pc.close() for pc in pcs]
    await asyncio.gather(*coros)
    pcs.clear()
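
To exercise the /offer endpoint without a browser, a hypothetical Python smoke test could look like this (the URL, port, and use of `requests` are assumptions; the exact Offer schema lives in src/schemas.py of the repo):

import asyncio

import requests
from aiortc import RTCPeerConnection, RTCSessionDescription

async def main():
    pc = RTCPeerConnection()
    pc.addTransceiver("video", direction="recvonly")  # receive-only video

    @pc.on("track")
    def on_track(track):
        print("receiving a", track.kind, "track")

    await pc.setLocalDescription(await pc.createOffer())
    resp = requests.post(
        "http://localhost:8000/offer",  # assumed host/port
        json={"sdp": pc.localDescription.sdp, "type": pc.localDescription.type},
    ).json()
    await pc.setRemoteDescription(
        RTCSessionDescription(sdp=resp["sdp"], type=resp["type"])
    )
    await asyncio.sleep(5)  # let a few frames flow
    await pc.close()

asyncio.run(main())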

 

 

OTHER

https://github.com/tsonglew/webrtc-stream

https://github.com/jtboing/liveness-web-demo/tree/master

https://github.com/AllanGallop/Webcam_Censoring-webRTC-YOLOv3/tree/master