
[AI] (5) Print out the Ollama request information

Posted on 2025-2-6 21:48:36
Requirement: I deployed the DeepSeek-R1 model with Ollama and want to inspect the requests that various clients send to it, to understand the details. Examples: Open WebUI, Continue, Cline, Roo Code, and so on.

Previously in this series:

[AI] (3) Tencent Cloud HAI tutorial for deploying DeepSeek-R1
https://www.itsvse.com/thread-10931-1-1.html

[AI] (4) Use Open WebUI to call the DeepSeek-R1 model
https://www.itsvse.com/thread-10934-1-1.html

To print the incoming requests on the server side, you need to enable debug mode. Edit the /etc/systemd/system/ollama.service.d/override.conf file and add the following configuration:
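Ollama controls its log verbosity with the OLLAMA_DEBUG environment variable, so a minimal drop-in looks like this (the [Service] header is standard systemd drop-in syntax):

[Service]
Environment="OLLAMA_DEBUG=1"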


Reload systemd and restart the ollama service with the following commands:
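These are the standard systemd commands; daemon-reload makes systemd pick up the new drop-in before the service restarts:

sudo systemctl daemon-reload
sudo systemctl restart ollama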


Use journalctl to view the service's log output with the following command:
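journalctl -u ollama -f

The -u flag filters the journal to the ollama unit, and -f keeps following new output as it is written.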


Use Open WebUI to send a chat request through Ollama for testing. [Screenshot: chatting with the model in Open WebUI]
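You can also hit the same /api/chat endpoint directly from the shell to produce identical log entries. A minimal sketch, assuming Ollama listens on its default port 11434 and the model tag is deepseek-r1:32b (check yours with ollama list):

curl http://localhost:11434/api/chat -d '{
  "model": "deepseek-r1:32b",
  "messages": [
    { "role": "user", "content": "My name is Xiao Zha, who are you?" }
  ]
}'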



The logs are as follows. Note that, besides the chat request itself, Open WebUI issues two additional /api/chat calls per conversation: one asking the model to generate a title and one asking it to generate tags, which is why the extra prompts appear below:

Feb 06 21:25:48 VM-0-8-ubuntu ollama[13503]: [GIN] 2025/02/06 - 21:25:48 | 200 |  6.186257471s |      172.18.0.2 | POST     "/api/chat"
Feb 06 21:25:48 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:25:48.411+08:00 level=DEBUG source=sched.go:407 msg="context for request finished"
Feb 06 21:25:48 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:25:48.411+08:00 level=DEBUG source=sched.go:339 msg="runner with non-zero duration has gone idle, adding timer" modelPath=/data/ollama/models/blobs/sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 duration=2562047h47m16.854775807s
Feb 06 21:25:48 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:25:48.411+08:00 level=DEBUG source=sched.go:357 msg="after processing request finished event" modelPath=/data/ollama/models/blobs/sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 refCount=0
Feb 06 21:25:54 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:25:54.834+08:00 level=DEBUG source=sched.go:575 msg="evaluating already loaded" model=/data/ollama/models/blobs/sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93
Feb 06 21:25:54 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:25:54.835+08:00 level=DEBUG source=routes.go:1470 msg="chat request" images=0 prompt=<|User|>My name is Xiao Zha, who are you? <|Assistant|>
Feb 06 21:25:54 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:25:54.836+08:00 level=DEBUG source=cache.go:104 msg="loading cache slot" id=0 cache=728 prompt=13 used=2 remaining=11
Feb 06 21:26:02 VM-0-8-ubuntu ollama[13503]: [GIN] 2025/02/06 - 21:26:02 | 200 |  7.642182053s |      172.18.0.2 | POST     "/api/chat"
Feb 06 21:26:02 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:02.454+08:00 level=DEBUG source=sched.go:407 msg="context for request finished"
Feb 06 21:26:02 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:02.454+08:00 level=DEBUG source=sched.go:339 msg="runner with non-zero duration has gone idle, adding timer" modelPath=/data/ollama/models/blobs/sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 duration=2562047h47m16.854775807s
Feb 06 21:26:02 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:02.454+08:00 level=DEBUG source=sched.go:357 msg="after processing request finished event" modelPath=/data/ollama/models/blobs/sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 refCount=0
Feb 06 21:26:02 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:02.491+08:00 level=DEBUG source=sched.go:575 msg="evaluating already loaded" model=/data/ollama/models/blobs/sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93
Feb 06 21:26:02 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:02.491+08:00 level=DEBUG source=routes.go:1470 msg="chat request" images=0 prompt="<|User|>### Task:\nGenerate a concise, 3-5 word title with an emoji summarizing the chat history.\n### Guidelines:\n- The title should clearly represent the main theme or subject of the conversation.\n- Use emojis that enhance understanding of the topic, but avoid quotation marks or special formatting.\n- Write the title in the chat's primary language; default to English if multilingual.\n- Prioritize accuracy over excessive creativity; keep it clear and simple.\n### Output:\nJSON format: { \"title\": \"your concise title here\" }\n### Examples:\n- { \"title\": \"Stock Market Trends\" },\n- { \"title\": \"Perfect Chocolate Chip Recipe\" },\n- { \"title\": \"Evolution of Music Streaming\" },\n- { \"title\": \"Remote Work Productivity Tips\" },\n- { \"title\": \"Artificial Intelligence in Healthcare\" },\n- { \"title\": \"Video Game Development Insights\" }\n### Chat History:\n<chat_history>\nUSER: My name is Xiao Zha, who are you? \nASSISTANT: Hello, Xiao Zha! I'm DeepSeek-R1-Lite-Preview, an intelligent assistant developed by DeepSeek, and I'll do my best to help you. Is there anything I can do for you? \n</chat_history><|Assistant|>"
Feb 06 21:26:02 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:02.495+08:00 level=DEBUG source=cache.go:104 msg="loading cache slot" id=1 cache=567 prompt=312 used=6 remaining=306
Feb 06 21:26:14 VM-0-8-ubuntu ollama[13503]: [GIN] 2025/02/06 - 21:26:14 | 200 | 12.263297485s |      172.18.0.2 | POST     "/api/chat"
Feb 06 21:26:14 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:14.731+08:00 level=DEBUG source=sched.go:407 msg="context for request finished"
Feb 06 21:26:14 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:14.731+08:00 level=DEBUG source=sched.go:339 msg="runner with non-zero duration has gone idle, adding timer" modelPath=/data/ollama/models/blobs/sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 duration=2562047h47m16.854775807s
Feb 06 21:26:14 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:14.731+08:00 level=DEBUG source=sched.go:357 msg="after processing request finished event" modelPath=/data/ollama/models/blobs/sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 refCount=0
Feb 06 21:26:14 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:14.769+08:00 level=DEBUG source=sched.go:575 msg="evaluating already loaded" model=/data/ollama/models/blobs/sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93
Feb 06 21:26:14 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:14.769+08:00 level=DEBUG source=routes.go:1470 msg="chat request" images=0 prompt="<|User|>### Task:\nGenerate 1-3 broad tags categorizing the main themes of the chat history, along with 1-3 more specific subtopic tags.\n\n### Guidelines:\n- Start with high-level domains (e.g. Science, Technology, Philosophy, Arts, Politics, Business, Health, Sports, Entertainment, Education)\n- Consider including relevant subfields/subdomains if they are strongly represented throughout the conversation\n- If content is too short (less than 3 messages) or too diverse, use only [\"General\"]\n- Use the chat's primary language; default to English if multilingual\n- Prioritize accuracy over specificity\n\n### Output:\nJSON format: { \"tags\": [\"tag1\", \"tag2\", \"tag3\"] }\n\n### Chat History:\n<chat_history>\nUSER: My name is Xiao Zha, who are you? \nASSISTANT: Hello, Xiao Zha! I'm DeepSeek-R1-Lite-Preview, an intelligent assistant developed by DeepSeek, and I'll do my best to help you. Is there anything I can do for you? \n</chat_history><|Assistant|>"
Feb 06 21:26:14 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:14.773+08:00 level=DEBUG source=cache.go:104 msg="loading cache slot" id=1 cache=637 prompt=249 used=7 remaining=242
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:17.717+08:00 level=DEBUG source=sched.go:575 msg="evaluating already loaded" model=/data/ollama/models/blobs/sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:17.718+08:00 level=DEBUG source=server.go:966 msg="new runner detected, loading model for cgo tokenization"
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: loaded meta data with 26 key-value pairs and 771 tensors from /data/ollama/models/blobs/sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 (version GGUF V3 (latest))
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv   0:                       general.architecture str              = qwen2
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv   1:                               general.type str              = model
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv   2:                               general.name str              = DeepSeek R1 Distill Qwen 32B
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv   3:                           general.basename str              = DeepSeek-R1-Distill-Qwen
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv   4:                         general.size_label str              = 32B
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv   5:                          qwen2.block_count u32              = 64
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv   6:                       qwen2.context_length u32              = 131072
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv   7:                     qwen2.embedding_length u32              = 5120
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv   8:                  qwen2.feed_forward_length u32              = 27648
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv   9:                 qwen2.attention.head_count u32              = 40
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  10:              qwen2.attention.head_count_kv u32              = 8
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  11:                       qwen2.rope.freq_base f32              = 1000000.000000
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  12:     qwen2.attention.layer_norm_rms_epsilon f32              = 0.000010
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  13:                          general.file_type u32              = 15
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  14:                       tokenizer.ggml.model str              = gpt2
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  15:                         tokenizer.ggml.pre str              = deepseek-r1-qwen
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  16:                      tokenizer.ggml.tokens arr[str,152064]  = ["!", "\"", "#", "$", "%", "&", "'", ...
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  17:                  tokenizer.ggml.token_type arr[i32,152064]  = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  18:                      tokenizer.ggml.merges arr[str,151387]  = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  19:                tokenizer.ggml.bos_token_id u32              = 151646
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  20:                tokenizer.ggml.eos_token_id u32              = 151643
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  21:            tokenizer.ggml.padding_token_id u32              = 151643
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  22:               tokenizer.ggml.add_bos_token bool             = true
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  23:               tokenizer.ggml.add_eos_token bool             = false
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  24:                    tokenizer.chat_template str              = {% if not add_generation_prompt is de...
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - kv  25:               general.quantization_version u32              = 2
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - type  f32:  321 tensors
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - type q4_K:  385 tensors
Feb 06 21:26:17 VM-0-8-ubuntu ollama[13503]: llama_model_loader: - type q6_K:   65 tensors
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_vocab: missing or unrecognized pre-tokenizer type, using: 'default'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_vocab: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_vocab: special tokens cache size = 22
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_vocab: token to piece cache size = 0.9310 MB
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: format           = GGUF V3 (latest)
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: arch             = qwen2
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: vocab type       = BPE
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: n_vocab          = 152064
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: n_merges         = 151387
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: vocab_only       = 1
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: model type       = ? B
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: model ftype      = all F32
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: model params     = 32.76 B
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: model size       = 18.48 GiB (4.85 BPW)
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: general.name     = DeepSeek R1 Distill Qwen 32B
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: BOS token        = 151646 '<|begin▁of▁sentence|>'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: EOS token        = 151643 '<|end▁of▁sentence|>'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: EOT token        = 151643 '<|end▁of▁sentence|>'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: PAD token        = 151643 '<|end▁of▁sentence|>'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: LF token         = 148848 'ÄĬ'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: FIM PRE token    = 151659 '<|fim_prefix|>'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: FIM SUF token    = 151661 '<|fim_suffix|>'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: FIM MID token    = 151660 '<|fim_middle|>'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: FIM PAD token    = 151662 '<|fim_pad|>'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: FIM REP token    = 151663 '<|repo_name|>'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: FIM SEP token    = 151664 '<|file_sep|>'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: EOG token        = 151643 '<|end▁of▁sentence|>'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: EOG token        = 151662 '<|fim_pad|>'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: EOG token        = 151663 '<|repo_name|>'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: EOG token        = 151664 '<|file_sep|>'
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llm_load_print_meta: max token length = 256
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: llama_model_load: vocab only - skipping tensors
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:18.440+08:00 level=DEBUG source=routes.go:1470 msg="chat request" images=0 prompt="<|User|>My name is Xiao Zha, who are you? <|Assistant|>\nHello, Xiao Zha! I'm DeepSeek-R1-Lite-Preview, an intelligent assistant developed by DeepSeek, and I'll do my best to help you. Is there anything I can do for you? <|end▁of▁sentence|><|User|>Hello DeepSeek-R1<|Assistant|>"
Feb 06 21:26:18 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:18.491+08:00 level=DEBUG source=cache.go:104 msg="loading cache slot" id=0 cache=223 prompt=64 used=13 remaining=51
Feb 06 21:26:24 VM-0-8-ubuntu ollama[13503]: [GIN] 2025/02/06 - 21:26:24 | 200 |  6.737131375s |      172.18.0.2 | POST     "/api/chat"
Feb 06 21:26:24 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:24.426+08:00 level=DEBUG source=sched.go:407 msg="context for request finished"
Feb 06 21:26:24 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:24.426+08:00 level=DEBUG source=sched.go:357 msg="after processing request finished event" modelPath=/data/ollama/models/blobs/sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 refCount=1
Feb 06 21:26:24 VM-0-8-ubuntu ollama[13503]: [GIN] 2025/02/06 - 21:26:24 | 200 | 10.172441322s |      172.18.0.2 | POST     "/api/chat"
Feb 06 21:26:24 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:24.918+08:00 level=DEBUG source=sched.go:407 msg="context for request finished"
Feb 06 21:26:24 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:24.918+08:00 level=DEBUG source=sched.go:339 msg="runner with non-zero duration has gone idle, adding timer" modelPath=/data/ollama/models/blobs/sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 duration=2562047h47m16.854775807s
Feb 06 21:26:24 VM-0-8-ubuntu ollama[13503]: time=2025-02-06T21:26:24.918+08:00 level=DEBUG source=sched.go:357 msg="after processing request finished event" modelPath=/data/ollama/models/blobs/sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 refCount=0





OP | Posted on 2025-2-6 21:53:55
You can also edit the ollama service file using the following command:
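Judging by the description below and the override.conf path used above, this is most likely systemd's built-in drop-in editor:

sudo systemctl edit ollama.service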


When you run this command, a text editor (usually vi or nano) opens on the service configuration. With plain systemctl edit, your changes are saved to the /etc/systemd/system/ollama.service.d/override.conf drop-in; systemctl edit --full edits the /etc/systemd/system/ollama.service unit file itself.
OP | Posted on 2025-2-7 09:08:25
Viewing the log output of a systemd service on Linux:
https://www.itsvse.com/thread-10154-1-1.html
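For reference, a few common variants for a service like ollama, all standard journalctl flags:

journalctl -u ollama -n 100 --no-pager   # last 100 lines, without the pager
journalctl -u ollama -b                  # entries since the current boot
journalctl -u ollama --since "21:00"     # entries after a given time today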