Why Model Inference Gets Faster on Long Contexts but Gains Little on Short Requests Once Flash Decoding Is Enabled: Engineering Practice from Split-K to Reduction Window
2026/6/1 19:20:57
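When using Elasticsearch for hybrid search, it is essential to understand how document scores are computed. Scoring becomes more complex when vector search (kNN) is combined with traditional text search (query_string). This article walks through a real query to explain how to use the ES explain API and how scores are computed in a hybrid query.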
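As a rough illustration of what working with explain output involves, the sketch below turns on the `explain` flag in a search request body and flattens the nested `_explanation` tree (each node carries `value`, `description`, and `details`) into indented lines. The helper names `with_explain` and `format_explanation` are hypothetical, and `sample_explanation` is a hand-written stand-in with made-up values, not output from a real cluster.

```python
from typing import Any, Dict, List


def with_explain(search_body: Dict[str, Any]) -> Dict[str, Any]:
    """Return a shallow copy of a search body with per-hit explanations enabled."""
    body = dict(search_body)
    body["explain"] = True  # each hit then carries an `_explanation` object
    return body


def format_explanation(node: Dict[str, Any], depth: int = 0) -> List[str]:
    """Flatten an `_explanation` tree into indented 'value  description' lines."""
    indent = "  " * depth
    lines = [f"{indent}{node['value']:.4f}  {node['description']}"]
    for child in node.get("details", []):
        lines.extend(format_explanation(child, depth + 1))
    return lines


# Hypothetical explanation tree: values and descriptions are invented
# for illustration and do not come from a real Elasticsearch response.
sample_explanation = {
    "value": 1.5,
    "description": "sum of:",
    "details": [
        {"value": 0.9, "description": "vector clause (made-up)", "details": []},
        {"value": 0.6, "description": "text clause (made-up)", "details": []},
    ],
}

if __name__ == "__main__":
    print(with_explain({"from": 0, "size": 50}))
    print("\n".join(format_explanation(sample_explanation)))
```

Printing the tree this way makes it easier to see which clause contributes which share of a hit's final score before reading the raw JSON.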
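Suppose we have a medical literature retrieval system that must return relevant documents for the user query "vaccine development". We use the following hybrid query strategy: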
{"min_score":0.8,"query":{"bool":{"must":[{"query_string":{"fields":["title_tks^10","title_sm_tks^5","important_kwd^30","important_tks^20","content_ltks^2","content_sm_ltks"],"type":"best_fields","query":"((\"vaccine development\" OR \"vaccin develop\"))","boost":1,"minimum_should_match":"60%"}}],"boost":0.05}},"from":0,"size":50,"knn":{"field":"q_vec","k":50,"similarity":0.8,"num_candidates":100,"query_vector":[/* 768维向量 */],"filter":{"bool":{"must":[{"query_string":{"fields":["title_tks^10","title_sm_tks^5","important_kwd^30","importan