Milvus Installation

From 牛奶河Wiki
Revision as of 14:32, 5 August 2024 (Mon) by 阿奔 (talk | contribs) (→V1 Sample)

Milvus is an open-source vector database that brings search to GenAI applications. It is designed specifically for similarity search on massive datasets of high-dimensional vectors.

Milvus was chosen here as the vector database, over Chroma and Pinecone.

Milvus provides client SDKs for Python, Java, and C++, among other languages.

Milvus 2

Installing Milvus Offline with Helm Charts

Download resources

  • Add the Milvus Helm repository locally and update it.
helm repo add milvus https://milvus-io.github.io/milvus-helm/
helm repo update
  • Standalone deployment
helm template my-release --set cluster.enabled=false --set etcd.replicaCount=1 --set minio.mode=standalone --set pulsar.enabled=false milvus/milvus > milvus_manifest.yaml
  • Cluster deployment
helm template my-release milvus/milvus > milvus_manifest.yaml
  • Custom configuration
wget https://raw.githubusercontent.com/milvus-io/milvus-helm/master/charts/milvus/values.yaml   # edit as needed
helm template -f values.yaml my-release milvus/milvus > milvus_manifest.yaml
  • Download the requirements file and the image-saving script
wget https://raw.githubusercontent.com/milvus-io/milvus/master/deployments/offline/requirements.txt
wget https://raw.githubusercontent.com/milvus-io/milvus/master/deployments/offline/save_image.py
  • Pull and save the images
pip3 install -r requirements.txt
python3 save_image.py --manifest milvus_manifest.yaml
# Requires the pyyaml and docker Python packages
  • Load the images
cd images/; for image in $(find . -type f -name "*.tar.gz") ; do gunzip -c "$image" | docker load; done
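
The pull-and-save step works from the image references in the rendered manifest. To see which images that covers, the following minimal sketch collects them, assuming each one appears on its own `image:` line. It is not the actual save_image.py; the helper name `list_images` and the sample manifest fragment are hypothetical.

```python
import re

# Matches lines such as `image: "repo/name:tag"` or `- image: repo/name:tag`.
IMAGE_LINE = re.compile(r"""^\s*(?:-\s*)?image:\s*["']?([^\s"']+)["']?\s*$""")


def list_images(manifest_text):
    """Return the unique image references found in a rendered manifest, in order."""
    seen = []
    for line in manifest_text.splitlines():
        match = IMAGE_LINE.match(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# Hypothetical manifest fragment, for illustration only.
SAMPLE = """
kind: Deployment
spec:
  template:
    spec:
      containers:
        - name: standalone
          image: "example.org/milvus:tag-a"
        - name: sidecar
          image: example.org/helper:tag-b
---
kind: StatefulSet
spec:
  template:
    spec:
      containers:
        - name: etcd
          image: example.org/helper:tag-b
"""

print(list_images(SAMPLE))
```

For a real manifest, the same helper could be called as `list_images(open("milvus_manifest.yaml").read())` to check which images need to be available on the offline host.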

Helm Install

kubectl apply -f milvus_manifest.yaml
# To uninstall: kubectl delete -f milvus_manifest.yaml
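
Before applying the manifest, it can help to review what it will create. The following is a minimal sketch that counts resources per kind, assuming every resource is identified by an unindented `kind:` line; `count_kinds` is a hypothetical helper, not part of Helm or kubectl.

```python
import re
from collections import Counter


def count_kinds(manifest_text):
    """Count Kubernetes resources per kind, using top-level `kind:` lines."""
    # Only unindented `kind:` lines are counted, so nested fields are ignored.
    kinds = re.findall(r"^kind:[ \t]*(\S+)[ \t]*$", manifest_text, flags=re.MULTILINE)
    return Counter(kinds)
```

For example, `count_kinds(open("milvus_manifest.yaml").read())` gives a quick overview of the Deployments, Services, and other resources in the rendered file.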

Milvus 1

Docker Install

docker pull milvusdb/milvus:cpu-latest
# Set up the configuration file and working directories
/u01/milvus
  : conf      # server_config.yaml
  : db        # index and vector storage
  : logs      # logs
  : wal       # write-ahead log
docker run -td --name milvusdev -e "TZ=Asia/Shanghai" -p 19530:19530 -p 19121:19121 \
 -v /u01/milvus/db:/var/lib/milvus/db \
 -v /u01/milvus/wal:/var/lib/milvus/wal \
 -v /u01/milvus/logs:/var/lib/milvus/logs \
 -v /u01/milvus/conf:/var/lib/milvus/conf \
 milvusdb/milvus:cpu-latest
# docker ps
CONTAINER ID   IMAGE                        COMMAND                  CREATED         STATUS          PORTS                                                                                          NAMES
f8e350986ef4   milvusdb/milvus:cpu-latest   "/tini -- /var/lib/m…"   4 seconds ago   Up 3 seconds   0.0.0.0:19121->19121/tcp, :::19121->19121/tcp, 0.0.0.0:19530->19530/tcp, :::19530->19530/tcp   milvusdev
322f00c39f82   registry                     "/entrypoint.sh /etc…"   2 weeks ago     Up 13 days     0.0.0.0:8000->5000/tcp, :::8000->5000/tcp                                                      uhry
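
The host directories mounted by the docker run command above can be prepared ahead of time. The following is a minimal sketch, assuming the four subdirectories listed earlier (conf, db, logs, wal) sit under a single base directory; `prepare_dirs` is a hypothetical helper.

```python
from pathlib import Path

# Subdirectories mounted into the container, as listed in the layout above.
SUBDIRS = ("conf", "db", "logs", "wal")


def prepare_dirs(base):
    """Create the conf, db, logs and wal directories under `base` if missing."""
    base = Path(base)
    created = []
    for name in SUBDIRS:
        path = base / name
        # exist_ok=True makes the call safe to repeat.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `prepare_dirs("/u01/milvus")` would create the layout used here; server_config.yaml still has to be placed in the conf directory by hand.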

# docker logs f8e350986ef4
    __  _________ _   ____  ______    
   /  |/  /  _/ /| | / / / / / __/    
  / /|_/ // // /_| |/ / /_/ /\ \    
 /_/  /_/___/____/___/\____/___/     

Welcome to use Milvus!
Milvus Release version: v1.1.1, built at 2021-06-15 14:51.05, with OpenBLAS library.
You are using Milvus CPU edition
Last commit id: 330cc61bede475c4a7a71841d54e633586cea829

Loading configuration from: /var/lib/milvus/conf/server_config.yaml
NOTICE: You are using SQLite as the meta data management. We recommend change it to MySQL.
Supported CPU instruction sets: avx2, sse4_2
FAISS hook AVX2
Milvus server started successfully!
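
Once the log reports a successful start, the published ports can be probed from a client machine. The following is a minimal sketch using only the Python standard library; `port_is_open` is a hypothetical helper, and a successful TCP connection only shows that the port accepts connections, not that Milvus itself is healthy.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

With the port mappings above, `port_is_open("127.0.0.1", 19530)` would check the port used by the client example that follows, and 19121 could be checked the same way.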

V1 Sample

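Milvus 1.x and 2.0 are not compatible with each other, and neither are the corresponding pymilvus versions. Older pymilvus releases also cannot be installed on newer Python versions such as 3.11. The example below uses pymilvus 1.1.0 installed on Python 3.6.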

# -*- coding: utf-8 -*-
import numpy as np
from milvus import Milvus, MetricType

HOST='192.168.0.242'
PORT=19530

milvus = Milvus(host=HOST, port=PORT)

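# Create the collection
num_vec = 5000
vec_dim = 768
collection_name = "demo1"
collection_param = {
    'collection_name': collection_name,
    'dimension': vec_dim,            # must match the dimension of the inserted vectors
    'index_file_size': 32,           # data file size threshold, in MB
    'metric_type': MetricType.IP     # inner product; larger scores mean closer matches
}
milvus.create_collection(collection_param)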

# Generate random data
vectors_array = np.random.rand(num_vec, vec_dim)

# Insert the vectors
status, ids = milvus.insert(collection_name=collection_name, records=vectors_array)   # returns a status and the IDs of the inserted vectors
milvus.flush([collection_name])

print(milvus.get_collection_stats(collection_name))

# QUERY
query_vec_array = np.random.rand(1, vec_dim)
status, results = milvus.search(collection_name=collection_name, query_records=query_vec_array, top_k=5)
print(status)
print(results)

# Drop the collection
# status = milvus.drop_collection(collection_name)

milvus.close()
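
To make the `MetricType.IP` setting and `top_k` more concrete, the following minimal sketch ranks vectors by inner product with an exhaustive scan in pure Python. It only illustrates what the scores mean; it is not how Milvus executes a search, and the function names are hypothetical.

```python
def inner_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_k_by_inner_product(vectors, query, k):
    """Return (index, score) pairs for the k vectors with the largest inner product."""
    scored = [(i, inner_product(v, query)) for i, v in enumerate(vectors)]
    # With inner product as the metric, a larger score is a closer match.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Tiny hand-made data set, for illustration only.
vectors = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [-1.0, -1.0]]
print(top_k_by_inner_product(vectors, [1.0, 0.5], k=2))
```

Because inner product is sensitive to vector length, vectors are often normalized to unit length before insertion when cosine-like ranking is wanted; the random vectors in the sample above are not normalized.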