Building and installing llama.cpp on Ubuntu

This post builds llama.cpp on Kylin and Ubuntu. For details, see the CSDN blog post "llama模型c语言推理@FreeBSD" (LLaMA model inference in C on FreeBSD).

Download the source code and build it:

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
mkdir build
cd build
cmake ..
cmake --build . --config Release

# Optional: install
make install 

# Or, instead of installing, add the build output directory to PATH
export PATH=/home/skywalk/github/llama.cpp/build/bin:$PATH
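
The PATH step above can be made safe to repeat, for example when it is placed in a shell startup file. The following is a minimal sketch, not part of llama.cpp; the helper name path_prepend is made up for illustration, and the directory is the one used in this post.

```shell
#!/usr/bin/env bash
# path_prepend is a hypothetical helper name, not part of llama.cpp.
# It prepends a directory to PATH only if PATH does not already list it.
path_prepend() {
    local dir="$1"
    case ":$PATH:" in
        *":$dir:"*) ;;              # already present: leave PATH unchanged
        *) PATH="$dir:$PATH" ;;     # otherwise put it in front
    esac
    export PATH
}

# Example: the build output directory used in this post.
path_prepend /home/skywalk/github/llama.cpp/build/bin
```

Calling the helper twice with the same directory leaves a single entry in PATH, so sourcing the file repeatedly does not make PATH grow.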

Download the llama model files (1.3 GB) from Baidu Netdisk; the share link asks for an extraction code.

Convert the model

Before converting, install the required Python library:

pip install sentencepiece

Convert (run this from the llama.cpp source root, where convert.py lives):

python convert.py ~/work/model/chinesellama/
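
After conversion it can be useful to confirm that the output really is a GGUF file before loading it. GGUF files start with the four ASCII bytes "GGUF", so a quick check is possible from the shell. This is only a sketch: the helper name is_gguf is made up for illustration, the output path is the one used in the run command later in this post, and the check looks at the magic bytes only, not at the rest of the file.

```shell
#!/usr/bin/env bash
# is_gguf is a hypothetical helper name. It only checks the 4-byte magic
# at the start of the file; it does not validate the rest of the model.
is_gguf() {
    local file="$1" magic
    # tr strips NUL bytes so that arbitrary binary files do not trigger warnings.
    magic=$(head -c 4 "$file" 2>/dev/null | tr -d '\0')
    [ "$magic" = "GGUF" ]
}

# Output path as used in the run command in this post.
model=~/work/model/chinesellama/ggml-model-f16.gguf
if is_gguf "$model"; then
    echo "looks like a GGUF file: $model"
else
    echo "missing or not a GGUF file: $model"
fi
```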

Run:

main -m ~/work/model/chinesellama/ggml-model-f16.gguf  -p "请写一个简单的python hello world例子"

Results: at the very least, it is fast.

On CPU-only Ubuntu:

你好，在不同领域中，相互之间的合作和交流。，实现共同目标。" 。 [end of text]

llama_print_timings:        load time =   11357.71 ms
llama_print_timings:      sample time =       1.53 ms /    21 runs   (    0.07 ms per token, 13734.47 tokens per second)
llama_print_timings: prompt eval time =      94.89 ms /     3 tokens (   31.63 ms per token,    31.62 tokens per second)
llama_print_timings:        eval time =    1215.98 ms /    20 runs   (   60.80 ms per token,    16.45 tokens per second)
llama_print_timings:       total time =    1327.15 ms /    23 tokens

Some questions get fairly good answers:

hello ,can you write a python hello world demo program? 程序。（注意：在编写程序时需要注意代码的可读性和可维护性）。

在编写代码时，需要注意代码的可读性、可维护性、可扩展性等方面。在编写代码时，可以使用以下方法来提高代码的可读性、可维护性、可扩展性：

1. 使用有意义的变量名和函数名。
2. 避免使用过长的代码和重复的代码，尽量使用简洁的表达。
3. 使用注释来解释代码的逻辑和实现方式，以便于他人理解。
4. 使用测试用例来验证代码的正确性，并及时修复错误。
5. 使用适当的编程技巧和方法来提高代码的可读性和可维护性。
6. 在代码中加入一定的代码规范和风格指南，以提高代码的可读性和可维护性。
7. 使用版本控制工具来管理代码，如Git等。

代码的可读性和可维护性对于代码的长期维护和可扩展性非常重要。代码的可读性、可维护性和可扩展性决定了代码的可读性、可理解性、可维护性和可扩展性。因此，在编写代码时，我们应该尽可能地使代码易于阅读、易于维护和易于扩展。 [end of text]

llama_print_timings:        load time =     224.27 ms
llama_print_timings:      sample time =      20.63 ms /   283 runs   (    0.07 ms per token, 13715.23 tokens per second)
llama_print_timings: prompt eval time =     242.40 ms /    13 tokens (   18.65 ms per token,    53.63 tokens per second)
llama_print_timings:        eval time =   17749.69 ms /   282 runs   (   62.94 ms per token,    15.89 tokens per second)
llama_print_timings:       total time =   18198.66 ms /   295 tokens
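
The "tokens per second" figure on each eval time line is simply the number of runs divided by the elapsed time in seconds, for example 20 / 1.21598 s ≈ 16.45. The sketch below recomputes it from the log with awk so that runs can be compared quickly; the helper name eval_tps is made up for illustration, and it assumes the llama_print_timings line layout shown above.

```shell
#!/usr/bin/env bash
# eval_tps is a hypothetical helper name. It reads llama.cpp timing output
# on stdin and prints generated tokens per second for each "eval time" line.
# "prompt eval time" lines are skipped by the anchored pattern.
eval_tps() {
    awk '/^llama_print_timings: +eval time/ {
        ms = $5; runs = $8          # e.g. "1215.98 ms / 20 runs"
        if (ms > 0) printf "%.2f\n", runs / (ms / 1000)
    }'
}

# The two CPU-only Ubuntu runs from this post:
eval_tps <<'EOF'
llama_print_timings:        eval time =    1215.98 ms /    20 runs   (   60.80 ms per token,    16.45 tokens per second)
llama_print_timings:        eval time =   17749.69 ms /   282 runs   (   62.94 ms per token,    15.89 tokens per second)
EOF
```

Both runs land at roughly 16 tokens per second, which matches the values llama.cpp printed itself.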

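On RISC-V Kylin, the sentencepiece library could not be installed, so the model could not be converted locally; I copied the converted model over from the amd64 Ubuntu machine instead. In testing it is indeed slow, and for some reason the model started chatting with itself, which is still rather amusing: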

It was too slow, so the output below is incomplete:

hello ,can you write a python hello world demo program? ？。」

In order to generate a Python hello world, you need to follow these steps:
1. Start by selecting the desired language.
2. Construct the sentence by creating a sentence that includes all the original thoughts and ideas.

In order to generate a Python hello world, you need to follow these steps:

1. Choose the language - Python, as the language it's most popular and widely used.
2. Construct the sentence by creating a sentence that includes all the original thoughts and ideas.

Here's an example:

Original sentence:
"Hello, my name is John, and I'm a little boy. I want to go to school every day.

Constructing the sentence:
"Hello, my name is John, and I'm a little boy. I want to go to school every day.

3. Create a sentence that

Still, it was a worthwhile experiment: on both RISC-V and x86, llama.cpp can be built, installed, and used!
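
Since the RISC-V machine used a model file copied from the amd64 machine, it may be worth confirming that the copy arrived intact before blaming odd output on the platform. The following is a minimal sketch using sha256sum from GNU coreutils; the helper names write_checksum and verify_checksum are made up for illustration, and how the files are transferred (scp, USB stick, and so on) is left open.

```shell
#!/usr/bin/env bash
# write_checksum and verify_checksum are hypothetical helper names.

# On the source machine (amd64 Ubuntu): record the checksum next to the model.
write_checksum() {
    ( cd "$(dirname "$1")" && sha256sum "$(basename "$1")" > "$(basename "$1").sha256" )
}

# On the target machine (RISC-V Kylin), after copying the model and its
# .sha256 file into the same directory: succeeds only if the digest matches.
verify_checksum() {
    ( cd "$(dirname "$1")" && sha256sum -c "$(basename "$1").sha256" >/dev/null 2>&1 )
}

# Example usage (paths as used in this post):
#   write_checksum  ~/work/model/chinesellama/ggml-model-f16.gguf   # on amd64
#   verify_checksum ~/work/model/chinesellama/ggml-model-f16.gguf   # on RISC-V
```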

Troubleshooting

Errors encountered:

/usr/bin/cmake: /home/skywalk/py310/lib/libcurl.so.4: no version information available (required by /usr/bin/cmake)
/usr/bin/ld: cannot find /lib64/libpthread.so.0: No such file or directory

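For the libcurl.so.4 warning: cmake is loading the copy under /home/skywalk/py310/lib instead of the system one. Deleting that one copy is enough: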

ls libcur*
libcurand.so libcurand.so.10.3.0.86 libcurl.so.4.8.0
libcurand.so.10 libcurl.so libcurl.so.4

rm libcurl.so.4
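
A more cautious variant of the step above is to rename the file rather than delete it, so the change can be undone if something in the Python environment turns out to need it. This is a sketch only; the helper name set_aside is made up for illustration, and the example path is the one named in the warning message.

```shell
#!/usr/bin/env bash
# set_aside is a hypothetical helper name: it renames a file (or symlink)
# to <name>.bak instead of deleting it, so one mv restores the original.
set_aside() {
    local f="$1"
    if [ -e "$f" ] || [ -L "$f" ]; then
        mv -- "$f" "$f.bak"
    else
        echo "nothing to set aside: $f" >&2
        return 1
    fi
}

# Example (path taken from the warning message in this post):
#   set_aside /home/skywalk/py310/lib/libcurl.so.4
# To undo:
#   mv /home/skywalk/py310/lib/libcurl.so.4.bak /home/skywalk/py310/lib/libcurl.so.4
```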

For the libpthread.so.0 error:

sudo apt install 'libpthread*'