September 2024 Post Archive - lightsong

Pushing and Pulling Video Streams

Summary: Streaming server projects https://github.com/tiangolo/nginx-rtmp-docker https://github.com/search?q=nginx%20rtmp&type=repositories RTSP and RTMP principles & stream pushing/pulling https://blog.csdn.ne ...

posted @ 2024-09-30 10:21 lightsong Views (12) Comments (0) Recommendations (0)

Implementing Real-Time Detection on Video Streams with yolov5

Summary: yolov5 https://github.com/ultralytics/yolov5 Support for RTSP video streams https://github.com/ultralytics/yolov5/blob/master/detect.py @smart_inference_mode() def run( ...

posted @ 2024-09-30 09:17 lightsong Views (492) Comments (0) Recommendations (0)

Top 100+ Generative AI Applications / Use Cases in 2024

Summary: Top 100+ Generative AI Applications / Use Cases in 2024 https://research.aimultiple.com/generative-ai-applications/#general-generative-ai-applications

posted @ 2024-09-25 16:17 lightsong Views (14) Comments (0) Recommendations (0)

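Visual Multimodal Large Models: Case Study

Summary: 360 Zhinao https://aiot.360.cn/solutions/factory https://zhuanlan.zhihu.com/p/633755589 1. Open object detection: natural-language input for fast data labeling. In some security and store-inspection scenarios, cameras are subject to human interference such as being blocked, shifted, or pointed outdoors; therefore, 3 ...

posted @ 2024-09-25 10:42 lightsong Views (129) Comments (0) Recommendations (0)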

Large Vision Model

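Summary: LVM https://yutongbai.com/lvm.html https://zhuanlan.zhihu.com/p/671423679 A Large Vision Model (LVM) is a large model that is trained and run for inference purely on visual data; its defining feature is that it involves no natural-language input or output. The proposal of this model ...

posted @ 2024-09-24 19:04 lightsong Views (120) Comments (0) Recommendations (0)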

llm integration framework

Summary: llm overview https://research.aimultiple.com/llmops-tools/ integration framework is our target. Architecture -- llm-app-stack https://github.com/a16z- ...

posted @ 2024-09-22 17:56 lightsong Views (16) Comments (0) Recommendations (0)

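Multimodal Video Understanding for Real-World Surveillance Scenarios

Summary: Multimodal Video Understanding for Real-World Surveillance Scenarios https://mp.weixin.qq.com/s/3iPeKtqVEKvWpOb_pqEOXA 3. Multimodal anomaly detection. Multimodal anomaly detection is commonly used in the surveillance video domain. Traditional anomaly detection focuses mainly on large-scale changes in the video frame or on abnormal behavior such as fights or car accidents. With advances in technology, especially GPT ...

posted @ 2024-09-21 22:22 lightsong Views (201) Comments (0) Recommendations (0)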

Which Problems of Large Models Can RAG Solve, and Which Can It Not?

Summary: RAG OVERVIEW https://opendatascience.com/getting-started-with-multimodal-retrieval-augmented-generation/ What is RAG? RAG is an architectural framewor ...

posted @ 2024-09-21 14:28 lightsong Views (148) Comments (0) Recommendations (0)

LANGCHAIN component

Summary: https://www.cnblogs.com/88223100/p/LangChain-the-hottest-LLM-application-development-framework-for-beginners.html Taking GPT models as an example: 1. Stale data: the current training data extends to 2021 ...

posted @ 2024-09-20 22:44 lightsong Views (6) Comments (0) Recommendations (0)

LLM DATASET

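Summary: Sources for large-model capabilities https://arxiv.org/pdf/2402.18041 Sources for large-model compliance https://arxiv.org/html/2402.12193v2 Sources for large-model crime detection https://www.kaggle.com/datasets/odins0n/ucf-crime-dat ...

posted @ 2024-09-20 22:33 lightsong Views (19) Comments (0) Recommendations (0)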

Phi-2: The surprising power of small language models

Summary: Phi-2: The surprising power of small language models https://www.microsoft.com/en-us/research/blog/phi-2-the-surprising-power-of-small-language-models

posted @ 2024-09-20 21:52 lightsong Views (22) Comments (0) Recommendations (0)

Pixtral 12B - the first-ever multimodal Mistral model.

Summary: Pixtral 12B - the first-ever multimodal Mistral model. https://mistral.ai/news/pixtral-12b/ Pixtral 12B in short: Natively multimodal, trained with in ...

posted @ 2024-09-20 21:29 lightsong Views (21) Comments (0) Recommendations (0)

A lightweight python package, an alternative to PyScaffold

Summary: python_package https://github.com/fanqingsong/python_package Description A production ready python library template Metadata and dependency informatio ...

posted @ 2024-09-20 18:12 lightsong Views (6) Comments (0) Recommendations (0)

LLM multiple modal applications

Summary: MoneyPrinterTurbo https://github.com/harry0703/MoneyPrinterTurbo/tree/main Generate high-definition short videos with one click using large AI models. Generate short videos with one click using AI LLM. FunCli ...

posted @ 2024-09-17 11:11 lightsong Views (9) Comments (0) Recommendations (0)

crewAI-examples

Summary: crewAI-examples github demo Text-to-image: reads an article, generates an illustration for it, and posts a tweet. https://github.com/wassim249/xgrow/blob/master/main.py#L23 Text-to-image https://github.com/coderphonui/crewai ...

posted @ 2024-09-16 23:27 lightsong Views (127) Comments (0) Recommendations (0)

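autoGPT metagpt crewAI langgraph autogen camel: Which Frameworks Suit Multimodal Scenarios? (ERNIE Bot)

Summary: autoGPT metagpt crewAI langgraph autogen camel: which frameworks suit multimodal scenarios? Characteristics: CrewAI is a technology dedicated to creating multimodal agents that can process text, image, and audio data at the same time. It provides the tools and libraries needed to build multimodal agents, making it easier for developers to integrate different models to handle multiple data types ...

posted @ 2024-09-16 11:08 lightsong Views (344) Comments (0) Recommendations (0)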

Zero-Shot, One-Shot, Few-Shot, In-Context Learning

Summary: Zero-Shot, One-Shot, Few-Shot, In-Context Learning https://blog.csdn.net/weixin_44212848/article/details/139902394 In-Context Learning definition: In-context learn ...

posted @ 2024-09-16 09:25 lightsong Views (269) Comments (0) Recommendations (0)

awesome-ai-agents

Summary: awesome-ai-agents https://datawhalechina.github.io/hugging-multi-agent/chapter2/AIAgent%E7%9F%A5%E8%AF%86%E4%BD%93%E7%B3%BB%E7%BB%93%E6%9E%84/#213-sy1 ...

posted @ 2024-09-15 22:04 lightsong Views (41) Comments (0) Recommendations (0)

Large Multimodal Agents: A Survey

Summary: Large Multimodal Agents: A Survey https://arxiv.org/pdf/2402.15116

posted @ 2024-09-15 21:03 lightsong Views (10) Comments (0) Recommendations (0)

Comparing Multi-agent AI frameworks

Summary: Comparing Multi-agent AI frameworks https://sajalsharma.com/posts/overview-multi-agent-fameworks/ A Comparative Overview To better understand the diff ...

posted @ 2024-09-15 19:10 lightsong Views (110) Comments (0) Recommendations (0)

Tenacity -- Retrying library for Python

Summary: Retrying library for Python https://github.com/jd/tenacity Please refer to the tenacity documentation for a better experience. Tenacity is an Apache 2 ...

posted @ 2024-09-15 18:42 lightsong Views (13) Comments (0) Recommendations (0)

ReAct && MRKL

Summary: ReAct https://learnprompting.org/docs/advanced_applications/react What is ReAct? ReAct (Reason + Act) is a paradigm that enables language models to s ...

posted @ 2024-09-15 17:32 lightsong Views (23) Comments (0) Recommendations (0)

Agentic workflow of LLM

Summary: Agentic Design Patterns https://www.deeplearning.ai/the-batch/how-agents-can-improve-llm-performance/?ref=dl-staging-website.ghost.io Reflection: The ...

posted @ 2024-09-15 16:52 lightsong Views (11) Comments (0) Recommendations (0)

Efficient DevSecOps Workflows with a Little Help from AI

Summary: Efficient DevSecOps Workflows with a Little Help from AI https://www.infoq.com/articles/efficient-devsecops-workflows/ AI is enhancing DevSecOps workf ...

posted @ 2024-09-15 15:54 lightsong Views (8) Comments (0) Recommendations (0)

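The GGUF File Format for Large Models

Summary: The GGUF File Format for Large Models https://www.datalearner.com/blog/1051705718835586 Large language models are usually developed with frameworks such as PyTorch, and their pretrained results are usually saved in the corresponding binary format; for example, a file with the pt suffix is usually a binary pretrained result saved by PyTorch. However, the storage of large models ...

posted @ 2024-09-14 16:16 lightsong Views (321) Comments (0) Recommendations (1)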

ragflow

Summary: ragflow https://github.com/infiniflow/ragflow RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understandi ...

posted @ 2024-09-14 16:09 lightsong Views (181) Comments (0) Recommendations (0)

Open-Source Large Vision Models

Summary: overview publisher name image support video support github test URL COMMENT OpenGVLab InternVL YES https://github.com/OpenGVLab/InternVL https://internvl.opengvlab.com/ Ope ...

posted @ 2024-09-13 22:46 lightsong Views (148) Comments (0) Recommendations (0)

Base/chat/instruct in LLM

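Summary: Base/chat/instruct https://blog.csdn.net/qq_43127132/article/details/140447880 In large-model repositories, base, chat, instruct, and 4bit usually refer to different types or configurations of pretrained language models. They differ mainly in training objective, intended use, and model parameters ...

posted @ 2024-09-10 22:52 lightsong Views (439) Comments (0) Recommendations (0)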

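Capabilities Supported by Large Models

Summary: Supported capabilities https://arxiv.org/pdf/2402.06196 Capability-enhancement methods. When we talk about large models, which new capabilities should we pay attention to? https://www.thepaper.cn/newsDetail_forward_22829654 Capability 1: emergent abilities. Emergent ...

posted @ 2024-09-10 22:44 lightsong Views (36) Comments (0) Recommendations (0)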

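Two Important Capabilities of Large Models (IF + FC)

Summary: MiniCPM https://github.com/OpenBMB/MiniCPM A large model released by ModelBest (面壁智能) that stands out in the following areas. Reasoning, long-text handling, and RAG are all common capabilities. Among them, instruction following (IF = instruction follow) and tool calling (FC = function call) are especially powerful, ...

posted @ 2024-09-09 23:14 lightsong Views (157) Comments (0) Recommendations (0)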

transformer -> multimodal

Summary: Transformer (language) https://www.cnblogs.com/kongen/p/18088002 https://www.infoq.cn/article/qbloqm0rf*sv6v0jmulf https://arxiv.org/pdf/2402.06196 ht ...

posted @ 2024-09-08 22:39 lightsong Views (223) Comments (0) Recommendations (0)

data-analysis-llm-agent

Summary: data-analysis-llm-agent https://github.com/fanqingsong/data-analysis-llm-agent Conversational AI with Function Calling for Data Analysis Overview The ...

posted @ 2024-09-08 19:48 lightsong Views (18) Comments (0) Recommendations (0)

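Smart Glasses

Summary: Huawei huawei-eyewear-2 https://consumer.huawei.com/cn/audio/huawei-eyewear-2/ https://new.qq.com/rain/a/20240515A06AXH00 11-hour battery life; connects to Huawei phones and tablets for the voice assistant, music playback, and answering or ending calls ...

posted @ 2024-09-02 21:05 lightsong Views (9) Comments (0) Recommendations (0)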

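Multimodal Large Models

Summary: Multimodal large language models (MLLMs) are a class of models that combine the natural-language processing capabilities of large language models (LLMs) with the ability to understand and generate data in other modalities, such as vision and audio. They aim to provide a richer and more natural interactive experience by integrating multiple types of input and output, including text, images, and sound. A Survey on Multimodal Large Language ...

posted @ 2024-09-01 20:57 lightsong Views (1464) Comments (0) Recommendations (0)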

Stay Hungry, Stay Foolish!

lightsong

{Web: [React, Vue, NodeJS, HTTP], DevOps: [Jenkins, Docker, K8S], Languages: [Python, JS, C, Lua, Shell, Groovy]}
