跳转到主要内容

标签(标签)

资源精选(342) Go开发(108) Go语言(103) Go(99) angular(83) LLM(78) 大语言模型(63) 人工智能(53) 前端开发(50) LangChain(43) golang(43) 机器学习(39) Go工程师(38) Go程序员(38) Go开发者(36) React(34) Go基础(29) Python(24) Vue(23) Web开发(20) Web技术(19) 精选资源(19) 深度学习(19) Java(18) ChatGTP(17) Cookie(16) android(16) 前端框架(13) JavaScript(13) Next.js(12) 安卓(11) 聊天机器人(10) typescript(10) 资料精选(10) NLP(10) 第三方Cookie(9) Redwoodjs(9) ChatGPT(9) LLMOps(9) Go语言中级开发(9) 自然语言处理(9) PostgreSQL(9) 区块链(9) mlops(9) 安全(9) 全栈开发(8) OpenAI(8) Linux(8) AI(8) GraphQL(8) iOS(8) 软件架构(7) RAG(7) Go语言高级开发(7) AWS(7) C++(7) 数据科学(7) 智能体(6) whisper(6) Prisma(6) 隐私保护(6) JSON(6) DevOps(6) 数据可视化(6) wasm(6) 计算机视觉(6) 算法(6) Rust(6) 微服务(6) 隐私沙盒(5) FedCM(5) 语音识别(5) Angular开发(5) 快速应用开发(5) 提示工程(5) Agent(5) LLaMA(5) 低代码开发(5) Go测试(5) gorm(5) REST API(5) kafka(5) 推荐系统(5) WebAssembly(5) GameDev(5) CMS(5) CSS(5) machine-learning(5) 机器人(5) 游戏开发(5) Blockchain(5) Web安全(5) nextjs(5) Kotlin(5) 低代码平台(5) 机器学习资源(5) Go资源(5) Nodejs(5) PHP(5) Swift(5) devin(4) Blitz(4) javascript框架(4) Redwood(4) GDPR(4) 生成式人工智能(4) Angular16(4) Alpaca(4) 编程语言(4) SAML(4) JWT(4) JSON处理(4) Go并发(4) 移动开发(4) 移动应用(4) security(4) 隐私(4) spring-boot(4) 物联网(4) 网络安全(4) API(4) Ruby(4) 信息安全(4) flutter(4) RAG架构(3) 专家智能体(3) Chrome(3) CHIPS(3) 3PC(3) SSE(3) 人工智能软件工程师(3) LLM Agent(3) Remix(3) Ubuntu(3) GPT4All(3) 软件开发(3) 问答系统(3) 开发工具(3) 最佳实践(3) RxJS(3) SSR(3) Node.js(3) Dolly(3) 移动应用开发(3) 低代码(3) IAM(3) Web框架(3) CORS(3) 基准测试(3) Go语言数据库开发(3) Oauth2(3) 并发(3) 主题(3) Theme(3) earth(3) nginx(3) 软件工程(3) azure(3) keycloak(3) 生产力工具(3) gpt3(3) 工作流(3) C(3) jupyter(3) 认证(3) prometheus(3) GAN(3) Spring(3) 逆向工程(3) 应用安全(3) Docker(3) Django(3) R(3) .NET(3) 大数据(3) Hacking(3) 渗透测试(3) C++资源(3) Mac(3) 微信小程序(3) Python资源(3) JHipster(3) 语言模型(2) 可穿戴设备(2) JDK(2) SQL(2) Apache(2) Hashicorp Vault(2) Spring Cloud Vault(2) Go语言Web开发(2) Go测试工程师(2) WebSocket(2) 容器化(2) AES(2) 加密(2) 输入验证(2) ORM(2) Fiber(2) Postgres(2) Gorilla Mux(2) Go数据库开发(2) 模块(2) 泛型(2) 指针(2) HTTP(2) PostgreSQL开发(2) Vault(2) K8s(2) Spring boot(2) R语言(2) 深度学习资源(2) 半监督学习(2) semi-supervised-learning(2) architecture(2) 普罗米修斯(2) 嵌入模型(2) productivity(2) 编码(2) Qt(2) 前端(2) Rust语言(2) NeRF(2) 神经辐射场(2) 元宇宙(2) CPP(2) 数据分析(2) spark(2) 流处理(2) Ionic(2) 人体姿势估计(2) human-pose-estimation(2) 视频处理(2) deep-learning(2) kotlin语言(2) kotlin开发(2) burp(2) Chatbot(2) npm(2) quantum(2) OCR(2) 游戏(2) game(2) 内容管理系统(2) MySQL(2) python-books(2) pentest(2) opengl(2) IDE(2) 漏洞赏金(2) Web(2) 知识图谱(2) PyTorch(2) 数据库(2) reverse-engineering(2) 数据工程(2) swift开发(2) rest(2) robotics(2) ios-animation(2) 知识蒸馏(2) 安卓开发(2) nestjs(2) solidity(2) 爬虫(2) 面试(2) 容器(2) C++精选(2) 人工智能资源(2) Machine Learning(2) 备忘单(2) 编程书籍(2) angular资源(2) 速查表(2) cheatsheets(2) SecOps(2) mlops资源(2) R资源(2) DDD(2) 架构设计模式(2) 量化(2) Hacking资源(2) 强化学习(2) flask(2) 设计(2) 性能(2) Sysadmin(2) 系统管理员(2) Java资源(2) 机器学习精选(2) android资源(2) android-UI(2) Mac资源(2) iOS资源(2) Vue资源(2) flutter资源(2) JavaScript精选(2) JavaScript资源(2) Rust开发(2) deeplearning(2) RAD(2)

A curated list of awesome embedding models tutorials, projects and communities. Please feel free to pull requests to add links.

Table of Contents

Papers

Word Embeddings

Word2vec, GloVe, FastText

  • Efficient Estimation of Word Representations in Vector Space (2013), T. Mikolov et al. [pdf]
  • Distributed Representations of Words and Phrases and their Compositionality (2013), T. Mikolov et al. [pdf]
  • word2vec Parameter Learning Explained (2014), Xin Rong [pdf]
  • word2vec Explained: deriving Mikolov et al.'s negative-sampling word-embedding method (2014), Yoav Goldberg, Omer Levy [pdf]
  • GloVe: Global Vectors for Word Representation (2014), J. Pennington et al. [pdf]
  • Improving Word Representations via Global Context and Multiple Word Prototypes (2012), EH Huang et al. [pdf]
  • Enriching Word Vectors with Subword Information (2016), P. Bojanowski et al. [pdf]
  • Bag of Tricks for Efficient Text Classification (2016), A. Joulin et al. [pdf]

Language Model

  • Semi-supervised sequence tagging with bidirectional language models (2017), Peters, Matthew E., et al. [pdf]
  • Deep contextualized word representations (2018), Peters, Matthew E., et al. [pdf]
  • Contextual String Embeddings for Sequence Labeling (2018), Akbik, Alan, Duncan Blythe, and Roland Vollgraf. [pdf]
  • BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (2018), [pdf]

Embedding Enhancement

  • Sentence Embedding:Learning Semantic Sentence Embeddings using Pair-wise Discriminator(2018),Patro et al.[Project Page] [Paper]
  • Retrofitting Word Vectors to Semantic Lexicons (2014), M. Faruqui et al. [pdf]
  • Better Word Representations with Recursive Neural Networks for Morphology (2013), T.Luong et al. [pdf]
  • Dependency-Based Word Embeddings (2014), Omer Levy, Yoav Goldberg [pdf]
  • Not All Neural Embeddings are Born Equal (2014), F. Hill et al. [pdf]
  • Two/Too Simple Adaptations of Word2Vec for Syntax Problems (2015), W. Ling[pdf]

Comparing count-based vs predict-based method

  • Linguistic Regularities in Sparse and Explicit Word Representations (2014), Omer Levy, Yoav Goldberg[pdf]
  • Don’t count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors (2014), M. Baroni [pdf]
  • Improving Distributional Similarity with Lessons Learned from Word Embeddings (2015), Omer Levy [pdf]

Evaluation, Analysis

  • Evaluation methods for unsupervised word embeddings (2015), T. Schnabel [pdf]
  • Intrinsic Evaluation of Word Vectors Fails to Predict Extrinsic Performance (2016), B. Chiu [pdf]
  • Problems With Evaluation of Word Embeddings Using Word Similarity Tasks (2016), M. Faruqui [pdf]
  • Improving Reliability of Word Similarity Evaluation by Redesigning Annotation Task and Performance Measure (2016), Oded Avraham, Yoav Goldberg [pdf]
  • Evaluating Word Embeddings Using a Representative Suite of Practical Tasks (2016), N. Nayak [pdf]

Phrase, Sentence and Document Embeddings

Sentence

Document

Sense Embeddings

Neural Language Models

Researchers

Courses and Lectures

Datasets

Training

Evaluation

Pre-Trained Language Models

Below is pre-trained ELMo models. Adding ELMo to existing NLP systems significantly improves the state-of-the-art for every considered task.

Below is pre-trained sent2vec models.

Pre-Trained Word Vectors

Convenient downloader for pre-trained word vectors:

Links for pre-trained word vectors:

Implementations and Tools

Word2vec

GloVe

原文:https://github.com/Hironsan/awesome-embedding-models